Channel: Big Data Modeling
Browsing all 3 articles

Visualizing Geohash

I recently had to process data about places, or points of interest, around the globe. It seemed intuitive to try to organize these records by their location. The standard way to group Hadoop records is...

Crawling the Web with Nutch on Amazon Elastic Map Reduce (EMR).

Nutch and EMR: The obvious choice for crawling the web these days seems to be Nutch. It is a mature Apache project which spun off Hadoop and Lucene, two arguably more successful Apache projects. Another...

Using SparkNotebook for Kaggle Data Analysis - Part I

Some Kaggle challenges have data that does not fit in the memory of a personal computer, or that fits but makes any exploratory or predictive data analysis painfully slow. Spark Notebook is an open...
