
Problem:

I am getting the below error while trying to run PySpark on my MacBook Air:

Exception: Java gateway process exited before sending the driver its port number


2 Answers


Solution:

One possible reason for this error is that JAVA_HOME is not set; another is that Java is not installed at all.
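A quick way to confirm this, before touching Spark at all, is to check from Python whether JAVA_HOME is set and a java binary is on the PATH. This is only a diagnostic sketch, not part of PySpark itself:

# Diagnostic sketch: if either of these checks fails, PySpark's launch_gateway()
# will typically die with the "port number" exception.
import os
import shutil

print("JAVA_HOME:", os.environ.get("JAVA_HOME", "<not set>"))
print("java on PATH:", shutil.which("java") or "<not found>")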

I also had the same issue, shown below:

Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/spark/launcher/Main : Unsupported major.minor version 51.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:643)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:296)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
    at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:406)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/spark/python/pyspark/conf.py", line 104, in __init__
    SparkContext._ensure_initialized()
  File "/opt/spark/python/pyspark/context.py", line 243, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "/opt/spark/python/pyspark/java_gateway.py", line 94, in launch_gateway
    raise Exception("Java gateway process exited before sending the driver its port number")
Exception: Java gateway process exited before sending the driver its port number

The exception was raised at sc = pyspark.SparkConf(). I solved it by installing Java 8:

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
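After the install, PySpark still needs to find that JDK. A minimal sketch of pointing JAVA_HOME at it from Python before creating a context (the /usr/lib/jvm/java-8-oracle path is an assumption based on the Oracle installer's usual layout; adjust it to your machine):

import os

# Assumed JDK location; change it to wherever Java 8 actually landed.
os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-oracle"

from pyspark import SparkConf, SparkContext

# If the gateway starts, the original exception is gone.
sc = SparkContext(conf=SparkConf().setAppName("gateway-check").setMaster("local[2]"))
print(sc.version)
sc.stop()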

Further reading:

https://github.com/jupyter/notebook/issues/743


Solution:

One solution is to include pyspark-shell in the value of the PYSPARK_SUBMIT_ARGS environment variable:

export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"

There is a change in python/pyspark/java_gateway.py which requires PYSPARK_SUBMIT_ARGS to include pyspark-shell whenever a PYSPARK_SUBMIT_ARGS variable is set by the user.
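If you start PySpark from a script or a Jupyter notebook rather than a login shell, you can set the same variable from Python, as long as this happens before the gateway is launched. A small sketch (the app name is just a placeholder):

import os

# Must be set before the first SparkContext is created, and must end with pyspark-shell.
os.environ["PYSPARK_SUBMIT_ARGS"] = "--master local[2] pyspark-shell"

from pyspark import SparkConf, SparkContext

sc = SparkContext(conf=SparkConf().setAppName("submit-args-check"))
print(sc.master)  # expect local[2]
sc.stop()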

Another probable cause is that JAVA_HOME is not set because Java is not installed.

I encountered a similar problem. It reports:

Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/spark/launcher/Main : Unsupported major.minor version 51.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:643)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:296)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
    at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:406)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/spark/python/pyspark/conf.py", line 104, in __init__
    SparkContext._ensure_initialized()
  File "/opt/spark/python/pyspark/context.py", line 243, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "/opt/spark/python/pyspark/java_gateway.py", line 94, in launch_gateway
    raise Exception("Java gateway process exited before sending the driver its port number")
Exception: Java gateway process exited before sending the driver its port number

The exception was raised at sc = pyspark.SparkConf(). I resolved it by installing Java 8:

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer

What was missing in my case was setting the master URL in the $PYSPARK_SUBMIT_ARGS environment variable, like this (assuming you use bash):

export PYSPARK_SUBMIT_ARGS="--master spark://<host>:<port>"

For example:

export PYSPARK_SUBMIT_ARGS="--master spark://192.168.2.40:7077"

You can put this in your .bashrc file. You will find the correct URL in the log for the Spark master (the location of this log is reported when you start the master with sbin/start-master.sh).
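If you prefer not to rely on the environment variable, the same master URL can be set on the SparkConf itself. A sketch, with a placeholder host and port; use the URL from your own master log:

from pyspark import SparkConf, SparkContext

# Placeholder master URL; replace with the one reported by your Spark master.
conf = SparkConf().setAppName("master-url-check").setMaster("spark://192.168.2.40:7077")
sc = SparkContext(conf=conf)
print(sc.master)
sc.stop()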

After spending hours trying many different solutions, I can confirm that the Java 10 SDK causes this error. On a Mac, navigate to /Library/Java/JavaVirtualMachines and run this command to uninstall JDK 10 completely:

sudo rm -rf jdk-10.jdk/
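After removing JDK 10, make sure JAVA_HOME points at a Java 8 install that is still present. On macOS the stock /usr/libexec/java_home utility can locate it; this sketch assumes a Java 8 JDK remains installed:

import os
import subprocess

# Ask macOS for the home of an installed 1.8 JDK and export it for PySpark.
java8_home = subprocess.check_output(["/usr/libexec/java_home", "-v", "1.8"]).decode().strip()
os.environ["JAVA_HOME"] = java8_home
print("Using JAVA_HOME:", java8_home)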


