Overhead imagery from satellites and drones has entered the mainstream of how we explore, understand, and tell stories about our world. These images are undeniable and arresting descriptions of cultural events, environmental disasters, economic shifts, and more. Data scientists recognize that their value goes far beyond anecdotal storytelling: imagery is unstructured data full of distinctive patterns in a high-dimensional space. With machine learning, we can extract structured data from the vast set of imagery available. RasterFrames extends Spark SQL with a strong Python API to enable processing of satellite, drone, and other spatial image data. This talk will discuss the fundamental ideas needed to make sense of this imagery data. We will discuss how the RasterFrames custom DataSource exploits convergent trends in how public and private providers publish images. Through deep Spark SQL integration, RasterFrames lets users work with imagery and other location-aware data sets in their existing data pipelines. RasterFrames fully supports Spark ML and interoperates smoothly with TensorFlow, Keras, and PyTorch. To crystallize these ideas, we will discuss a practical data science case study using overhead imagery in PySpark.
Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.