No
Yes
View More
View Less
Working...
Close
OK
Cancel
Confirm
System Message
Delete
Schedule
An unknown error has occurred and your request could not be completed. Please contact support.
Reserved
You've been added as a Walk-up
Personal Calendar
 
Conference Event
Meeting
Interests
There aren't any available sessions at this time.
Conflict Found
This session is already scheduled at another time. Would you like to...
Loading...
Please enter a maximum of {0} characters.
{0} remaining of {1} character maximum.
Please enter a maximum of {0} words.
{0} remaining of {1} word maximum.
must be 50 characters or less.
must be 40 characters or less.
Session Summary
We were unable to load the map image.
This has not yet been assigned to a map.
Search Catalog
Reply
Replies ()
Search
New Post
Microblog
Microblog Thread
Post Reply
Post
Your session timed out.
Meeting Summary

ABD403 - Best Practices for Distributed Machine Learning and Predictive Analytics Using Amazon EMR and Open-Source Tools

Session Description

This session, we focus on common use cases and design patterns for predictive analytics using Amazon EMR. We address accessing data from a data lake, extraction and preprocessing with Apache Spark, analytics and machine learning code development with notebooks (Jupyter, Zeppelin), and data visualization using Amazon QuickSight. We cover other operational topics, such as deployment patterns for ad hoc exploration and batch workloads using Spot and multi-user notebooks. The intended audience for this session includes technical users who are building statistical and data analytics models for the business using tools, such as Python, R, Spark, Presto, Amazon EMR, Notebooks.

Session Speakers
Additional Information
Aria
Breakout Session
Analytics & Big Data
Expert (400 level)
Please note that session information is subject to change.
Media