You may package a JAR and use spark-submit to submit the code to Spark. But sometimes you want to hack around in the shell, and sometimes that hack turns into a long-running query. How do you keep spark-shell running after you have gone home?
It took me some fiddling, but this works (with a bit of help from StackExchange).
In Unix shell #1, do:
mkfifo my_pipe
nohup spark-shell YOUR_CONFIG < my_pipe > YOUR_OUTPUT_FILE
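For concreteness, that second line might look something like this (the YARN master, memory flag and output file name are just examples I've made up, not anything prescribed):
nohup spark-shell --master yarn --driver-memory 4g < my_pipe > spark_shell.out 2>&1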
Now in Unix shell #2, do:
nohup cat YOUR_SPARK_SCALA > my_pipe 2>/dev/null &
You should now see the Spark shell jump into life.
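For what it's worth, YOUR_SPARK_SCALA is just an ordinary Scala script of the sort you'd type into the shell by hand. A minimal sketch, created here with a heredoc, might look like this (the file name, input path, column and output path are all invented, and I'm assuming a Spark 2.x shell where the spark session is predefined):
cat > long_query.scala <<'EOF'
// hypothetical long-running job: aggregate a large Parquet dataset
val events = spark.read.parquet("/data/events")
val counts = events.groupBy("user_id").count()
counts.write.mode("overwrite").parquet("/data/event_counts")
EOF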
Now, back in shell #1, press CTRL+Z and type:
jobs
Identify your application's job ID, then type:
bg JOB_ID
Alternatively, following the advice in this StackOverflow answer, you can press CTRL+Z, find the JOB_ID from jobs, then bg as above before calling:
disown -h %JOB_ID
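Putting those last few keystrokes together, the exchange in shell #1 looks roughly like this (assuming the spark-shell happens to be job 1):
# press CTRL+Z to suspend the foreground spark-shell
jobs            # lists e.g. "[1]+  Stopped   nohup spark-shell ..."
bg %1           # resume job 1 in the background
disown -h %1    # stop the shell sending it SIGHUP when you log out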
You may now log off and go home.
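If you want reassurance before leaving, tailing the output file is a quick, optional way to check the query is still ticking over:
tail -f YOUR_OUTPUT_FILE    # watch the shell's output as the query runs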