In this second tutorial (see the first one) we will introduce basic concepts about SparkSQL with R that you can find in the SparkR documentation, applied to the 2013 American Community Survey dataset. We will do two things: read data into a SparkSQL data frame, and have a quick look at the schema and what we have read. We will work with IPython/Jupyter notebooks here.

In the first tutorial, we explained how to download the data files and start a Jupyter notebook using SparkR. In order to continue with this tutorial, you will need to take that one first, so that you have the data downloaded locally. All the code for this series of Spark and R tutorials can be found in its own GitHub repository.

Instructions

As we already mentioned, for this series of tutorials/notebooks we have used Jupyter with the IRkernel R kernel. A good way of using these notebooks is by first cloning the repo and then starting Jupyter from the cloned folder. If you have a standalone Spark installation running on your localhost, see the installation instructions for your specific setup.
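The two steps described above (reading the survey data into a SparkSQL data frame and inspecting its schema) can be sketched in SparkR roughly as follows. This is a minimal sketch against the SparkR 1.x API of that era; the data file path and name are assumptions, not taken from the tutorial, and the CSV reader relies on the third-party `spark-csv` package.

```r
# Assumes the SparkR package from a local Spark installation is on .libPaths()
library(SparkR)

sc <- sparkR.init(master = "local")   # start a local Spark context
sqlContext <- sparkRSQL.init(sc)      # SparkSQL entry point

# Read the ACS data into a SparkSQL DataFrame.
# The file path below is hypothetical; point it at your local download.
df <- read.df(sqlContext,
              "./data/acs_2013.csv",
              source = "com.databricks.spark.csv",
              header = "true",
              inferSchema = "true")

printSchema(df)   # quick look at the inferred schema
head(df)          # peek at the first few rows

sparkR.stop()
```

With the schema printed, you can confirm column names and types before moving on to queries; the follow-up tutorials build on this same data frame.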