Leading companies are using machine learning (ML) to power innovation across industries, including healthcare, automotive, and finance. In this session, learn how to build scalable ML solutions using the Amazon SageMaker platform, as well as our services for computer vision, language, and analytics. We also demonstrate real-world use cases for enterprises to get more value from their data and integrate and manage intelligent systems and processes.
Deep learning has the potential to enable extremely advanced AI applications. But it's not taught in most computer science programs, and you may have a lot of questions. In this session, understand how deep learning works, and learn key concepts such as neural networks, activation functions, and optimizers. We show you how deep learning models improve through complex pattern recognition in pictures, text, sounds, and other data to produce more accurate insights and predictions. We also share examples of common deep learning use cases, such as computer vision and recommendation models. Finally, we help you understand how to get started using popular deep learning frameworks, such as TensorFlow, Apache MXNet, and PyTorch.
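As a rough illustration of the concepts this session names (neural networks, activation functions, optimizers), here is a minimal, framework-free sketch, not production code: a one-layer network with a sigmoid activation trained by plain gradient descent.

```python
import numpy as np

def sigmoid(z):
    # Activation function: squashes any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Tiny one-layer network: 3 inputs (third is a bias term) -> 1 output,
# trained to reproduce an OR-like pattern of the first two inputs.
rng = np.random.default_rng(0)
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[0.0], [1.0], [1.0], [1.0]])
W = rng.normal(size=(3, 1))

learning_rate = 1.0
for _ in range(5000):
    pred = sigmoid(X @ W)                     # forward pass
    delta = (pred - y) * pred * (1.0 - pred)  # gradient through the sigmoid
    W -= learning_rate * (X.T @ delta)        # optimizer: plain gradient descent

pred = sigmoid(X @ W)
```

Frameworks such as TensorFlow, Apache MXNet, and PyTorch automate exactly these pieces (layers, activations, gradients, optimizers) at much larger scale.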
Video-based tools have enabled advancements in computer vision, such as in-vehicle use cases for AI. However, it is not always possible to send this data to the cloud for processing. In this session, learn how to train machine learning models using Amazon SageMaker and deploy them to an edge device using AWS Greengrass, enabling you to process data quickly at the edge, even when there is no connectivity.
Amazon brings natural language processing, automatic speech recognition, text-to-speech services, and neural machine translation technologies within the reach of every developer. In this session, learn how to add intelligence to any application with machine learning services that provide language and chatbot functions. See how others are defining and building the next generation of apps that can hear, speak, understand, and interact with the world around us.
Analyzing customer service interactions across channels provides a complete 360-degree view of customers. By capturing all interactions, you can better identify the root cause of issues and improve first-call resolution and customer satisfaction. In this session, learn how to use machine learning to quickly process and analyze thousands of customer conversations to gain valuable insights. With speech and text analytics, you can pick up on emerging service-related trends before they escalate, and identify and address a potential widespread problem at its inception.
Amazon SageMaker comes preconfigured with popular deep learning frameworks, including TensorFlow. In this session, learn how to use TensorFlow with Amazon SageMaker, and dive into training and deploying machine learning models. Additionally, we discuss multi-GPU training using Amazon EC2 P3 instances, the highest-performing cloud infrastructure for deep learning.
Deep learning continues to push the state of the art in domains such as computer vision and natural language processing. In this session, learn to build deep learning applications using PyTorch, a fast and efficient deep learning library for running neural network models. We also demonstrate how to build, train, and deploy state-of-the-art models using PyTorch integrated with the Amazon SageMaker platform.
Amazon SageMaker, our fully managed machine learning platform, comes with pre-built algorithms and popular deep learning frameworks. Amazon SageMaker also includes an Apache Spark library that you can use to easily train models from your Spark clusters. In this code-level session, we show you how to integrate your Apache Spark application with Amazon SageMaker. We also dive deep into starting training jobs from Spark, integrating training jobs in Spark pipelines, and more.
Amazon SageMaker is a fully managed platform that enables developers and data scientists to quickly and easily build, train, and deploy machine learning models at any scale. In this session, learn how to use Amazon SageMaker's pre-built algorithms in key use cases, such as financial forecasting and predicting outcomes in healthcare.
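To give a flavor of what "build, train, and deploy" looks like in practice, the sketch below assembles the request that boto3's `sagemaker_client.create_training_job(**params)` expects. The bucket names, role ARN, and image URI are placeholders for illustration only.

```python
# Sketch: parameters for a SageMaker training job. All ARNs, S3 URIs, and
# the ECR image URI below are hypothetical placeholders.
def training_job_params(job_name, image_uri, role_arn, s3_input, s3_output):
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_input,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": s3_output},
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }

params = training_job_params(
    "demo-forecast-job",
    "123456789012.dkr.ecr.us-east-1.amazonaws.com/forecasting:latest",
    "arn:aws:iam::123456789012:role/SageMakerRole",
    "s3://my-bucket/train/",
    "s3://my-bucket/output/",
)
```

The same pattern applies whether the training image is one of SageMaker's pre-built algorithms or your own container.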
Natural language processing holds the key to unlocking business value from unstructured data. Organizations that implement effective data analysis methods gain a competitive advantage through improved decision-making, risk reduction, or enhanced customer experience. In this session, learn how to easily process, analyze, and visualize data by pairing Amazon Comprehend with services like Amazon Relational Database Service (Amazon RDS), Amazon Elasticsearch Service, and Amazon Neptune. We also share real-world examples of how customers built text analytics solutions with Amazon Comprehend.
Machine learning (ML) enables developers to build scalable solutions that maximize the use of media assets through automatic metadata extraction. From automatic transcription and language translation to face detection and celebrity recognition, ML enables you to automate manual workflows and optimize the use of your video content. In this session, learn how to use services such as Amazon Rekognition, Amazon Translate, and Amazon Comprehend to build a searchable video library, automate the creation of highlight reels, and more.
Deep learning can be used to help make targeted and personalized product and content recommendations. In this session, learn how to build a large-scale recommendation system using MXNet within the Amazon SageMaker machine learning platform. In this code-level session, we also show tutorials and examples of successful recommendation systems, such as the one that powers Amazon.com.
The transformation of the auto industry from manufacturers to mobility providers is centered on seamlessly and safely connecting vehicles to the outside world. In this session, we discuss how customers are using AWS for a variety of connected vehicle use cases. Leave this session with source code, architecture diagrams, and an understanding of how to use the AWS connected vehicle reference architecture to build your own prototypes. Also learn how companies leverage Amazon services such as Alexa, AWS IoT, AWS Greengrass, AWS Lambda, and Amazon Kinesis Data Analytics to rapidly develop and deploy innovative mobility services. Learn how to use new enhancements in your architectures, including the IoT Device Simulator, a scalable simulated-vehicle load generation tool, as well as the AWS IoT Framework for Automotive Grade Linux (AGL), an integrated build tool for AGL that includes the AWS IoT Device SDK and AWS Greengrass.
Vehicle mobility is evolving from traditional rental and fleet services to car sharing, ride hailing, and future driverless services. Mobility providers need an agile, scalable, digital platform to manage all aspects of their fleet and its usage. In this session, Avis Budget Group (ABG) and Slalom walk through their serverless mobility platform using the AWS connected vehicle reference architecture, Amazon SageMaker, Amazon Kinesis Data Analytics, and AWS Lambda. Learn the practical application of using AWS IoT to connect vehicles and Amazon SageMaker to apply ML to uncover insights for use cases, including vehicle inventory, shuttling efficiency, driver behavior, and vehicle trajectory analysis to identify fraudulent vehicle usage. We dive deep into the overall solution and services mentioned above, as well as the operations dashboard ABG created with Uber's open source framework, deck.gl.
Automotive companies are building next generation connectivity platforms on AWS to take advantage of the advanced analytics and auto-scale features of the cloud. In this workshop, we walk through the use cases demonstrated in the AWS connected vehicle solution, such as anomaly detection and trip aggregation processing, as well as the core services in the solution: AWS IoT, AWS Greengrass, AWS Lambda, Amazon DynamoDB, Amazon Kinesis Data Analytics, and Amazon S3. Participants will deploy the connected vehicle solution using an AWS CloudFormation template, and get hands-on experience deploying the AWS IoT Framework for Automotive Grade Linux (AGL) on automotive-grade hardware. By the end of the session, attendees will have a working solution for publishing data from the device to the AWS connected vehicle solution deployed in their accounts, and they can begin customizing with their own devices.
In this session, we simplify data lakes and analytics processing as a data bus, comprising various stages: collect, store, process, analyze, and visualize. Next, we discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. Finally, we provide reference architectures, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost.
Most companies are overrun with data, yet they lack critical insights to make timely and accurate business decisions. They are missing the opportunity to combine large amounts of new, unstructured big data that resides outside their data warehouse with trusted, structured data inside their data warehouse. In this session, we discuss the most common use cases with Amazon Redshift, and we take an in-depth look at how modern data warehousing blends and analyzes all your data to give you deeper insights to run your business.
Did you know that you can use Amazon Elasticsearch Service (Amazon ES) for log analytics, full-text search, and application monitoring? In this session, learn about and experience Amazon ES. We cover the ways you can use Amazon ES, with live demonstrations and customer examples.
Did you know you can use Amazon EMR to do clickstream analysis, real-time analytics, log analysis, ETL, predictive analytics, and genomics? Come to this session to learn about Amazon EMR—the AWS Hadoop and Spark offering—and learn how it can help your business.
In this talk, Anurag Gupta, VP for AWS Analytic and Transactional Database Services, talks about some of the key trends we see in data lakes and analytics, and he describes how they shape the services we offer at AWS. Specific trends include the rise of machine-generated logs as the dominant source of data, the move towards serverless, API-centric computing, and the growing need for local access to data from users around the world.
As Amazon's consumer business continues to grow, so does the volume of data and the number and complexity of the analytics done in support of the business. In this session, we talk about how Amazon.com uses AWS technologies to build a scalable environment for data and analytics. We look at how Amazon is evolving the world of data warehousing with a combination of a data lake and parallel, scalable compute engines, such as Amazon EMR and Amazon Redshift.
There are many serverless approaches to processing streaming data in real time, including AWS Lambda, Amazon Kinesis Data Analytics, and Amazon Kinesis Data Firehose. Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. In this session, we cover how to implement common use cases using serverless architectures, including streaming ETL, continuous metrics generation, and responsive analytics. We also cover which services to use for which use cases, and we discuss best practices for each.
Amazon Kinesis Video Streams makes it easy to capture live video, play it back, and store it for real-time and batch-oriented ML-driven analytics. In this session, we first dive deep into the top five best practices for getting started and scaling with Kinesis Video Streams. Next, we do a live demonstration of streaming video from a standard USB camera connected to a laptop, and we perform the live playback on a standard browser within minutes. We also have on stage Nirovision—a company that is building smart solutions to monitor businesses worldwide using Kinesis Video Streams. A representative from Nirovision walks through the technical details of this integration, highlighting the successes and difficulties they encountered along the way.
Companies have valuable data that they might not be analyzing due to the complexity, scalability, and performance issues of loading the data into their data warehouse. With the right tools, you can extend your analytics to query data in your data lake—with no loading required. Amazon Redshift Spectrum extends the analytic power of Amazon Redshift beyond data stored in your data warehouse to run SQL queries directly against vast amounts of unstructured data in your Amazon S3 data lake. This gives you the freedom to store your data where you want, in the format you want, and have it available for analytics when you need it. Join a discussion with AWS solution architects to ask questions and learn more about how you can extend your analytics beyond your data warehouse.
In this lab, we use a scheduled AWS Lambda function to pull data from the describe calls for Amazon RDS, Amazon DynamoDB, Amazon EC2, AWS CloudFormation, IAM, Elastic Load Balancing, Amazon Kinesis, Amazon Redshift, and more into Amazon Elasticsearch Service (Amazon ES). We couple this data with AWS CloudTrail data, and dig into it with Kibana to give you a real-time window into your AWS usage.
Amazon Elasticsearch Service (Amazon ES) is both a search solution and a log monitoring solution. In this session, we address both. We build a front-end, PHP web server that provides a search experience on movie data as well as backend monitoring to send Apache web logs, syslogs, and application logs to Amazon ES. We tune the relevance for the search experience and build Kibana visualizations for the log data. In addition, we use security best practices and deploy everything into a VPC.
In this session, we demonstrate how to use AWS Step Functions and Apache Livy on Amazon EMR to submit parallel Spark jobs on Amazon EMR. We provide all artifacts needed to build this solution.
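As a hint of how this works under the hood, Apache Livy accepts Spark job submissions as JSON posted to its `/batches` endpoint; a Step Functions task (typically via AWS Lambda) builds and posts a body like the one sketched below. The S3 path and arguments are placeholders.

```python
# Sketch of the JSON body for Apache Livy's POST /batches endpoint,
# which submits a Spark job to the cluster. Paths and args are placeholders.
def livy_batch_payload(file, args=None, conf=None):
    payload = {"file": file}
    if args:
        payload["args"] = list(args)
    if conf:
        payload["conf"] = dict(conf)
    return payload

body = livy_batch_payload(
    "s3://my-bucket/jobs/aggregate.py",
    args=["--date", "2018-11-01"],
    conf={"spark.executor.memory": "4g"},
)
# The orchestrator would POST this body to http://<emr-master>:8998/batches
# and poll the returned batch id until the job completes.
```

Submitting several such payloads in parallel from a Step Functions state machine is what gives you concurrent Spark jobs on one cluster.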
A modern service consists of many microservices working together. But how do you know they are all working together? How can you outwit entropy? The answer, as with so many issues, is logs, plus a strong and automated log analyzer. In this session, we show how you can manage workflow and error management by using logs and Amazon Elasticsearch Service (Amazon ES) to create massively scalable, yet manageable, microservices solutions.
In this session, we demonstrate how to use AWS Glue to build ETL pipelines to and from Amazon Relational Database Service (Amazon RDS), Amazon Redshift, and Amazon S3. We provide all needed artifacts and sample code.
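For orientation, a Glue ETL pipeline starts with a job definition; the sketch below assembles the parameters that boto3's `glue_client.create_job(**params)` expects. The role name and S3 locations are placeholders.

```python
# Sketch: parameters for an AWS Glue ETL job definition. The role and
# S3 paths below are hypothetical placeholders.
def glue_job_params(name, role, script_s3_path):
    return {
        "Name": name,
        "Role": role,
        "Command": {
            "Name": "glueetl",             # Spark-based Glue ETL job
            "ScriptLocation": script_s3_path,
        },
        "DefaultArguments": {
            "--job-language": "python",
            "--TempDir": "s3://my-bucket/glue-temp/",
        },
    }

params = glue_job_params(
    "rds-to-redshift-nightly",
    "AWSGlueServiceRoleDefault",
    "s3://my-bucket/scripts/rds_to_redshift.py",
)
```

The script itself (a PySpark job using Glue's DynamicFrames) handles the actual reads and writes against Amazon RDS, Amazon Redshift, and Amazon S3.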
In this session, learn how to build a robust and scalable data warehouse very quickly without a large team of experts.
Organizations need to gain insight and knowledge from a growing number of IoT, API, clickstream, unstructured, and log data sources. However, organizations are also often limited by legacy data warehouses and ETL processes that were designed for transactional data. In this session, we introduce key ETL features of AWS Glue and cover common use cases ranging from scheduled nightly data warehouse loads to near real-time, event-driven ETL pipelines for your data lake. We also discuss how to build scalable, efficient, and serverless ETL pipelines using AWS Glue.
As data volumes grow and customers store more data on AWS, they often have valuable data that is not easily discoverable and available for analytics. Learn how AWS Glue makes it easy to build and manage enterprise-grade data lakes on Amazon S3. AWS Glue can ingest data from a variety of sources into your data lake, clean it, transform it, and automatically register it in the AWS Glue Data Catalog, making data readily available for analytics. Learn how you can set appropriate data governance policies in the Data Catalog and make data available for a variety of use cases, such as running ad hoc analytics in Amazon Athena, querying across your data warehouse and data lake with Amazon Redshift Spectrum, running big data analysis in Amazon EMR, and building machine learning models with Amazon SageMaker and AWS Glue.
Amazon Kinesis makes it easy to reduce the time it takes to get valuable, real-time insights from your data. In this session, we walk through the most popular applications that customers implement using Kinesis, including streaming extract-transform-load (ETL), continuous metric generation, and responsive analytics. For each application, we compare implementation options and discuss how to decide which AWS service to use. Each implementation option covers different Kinesis integrations, including AWS Lambda, Amazon Elasticsearch Service, Amazon S3, Amazon EMR, AWS Glue, and AWS database services.
One of the biggest trade-offs customers usually make when deploying BI solutions at scale is agility versus governance. Large-scale BI implementations with the right governance structure can take months to design and deploy. In this session, learn how you can avoid making this trade-off using Amazon QuickSight. Learn how to easily deploy Amazon QuickSight to thousands of users using Microsoft Active Directory and Federated SSO, while securely accessing your data sources in Amazon VPCs or on-premises. We also cover how to control access to your datasets, implement row-level security, create scheduled email reports, and audit access to your data.
Customers are migrating their analytics, data processing (ETL), and data science workloads running on Apache Hadoop, Spark, and data warehouse appliances from on-premises deployments to AWS in order to save costs, increase availability, and improve performance. AWS offers a broad set of analytics services, including solutions for batch processing, stream processing, ML, data workflow orchestration, and data warehousing. In this session, we focus on identifying the components and workflows in your current environment and providing the best practices to migrate these workloads to the right AWS data analytics product. We cover such services as Amazon EMR, Amazon Athena, Amazon Redshift, Amazon Kinesis, and more.
In this session, learn how to set up your data lake on Amazon S3 and automatically catalog it in the AWS Glue Data Catalog with AWS Glue crawlers. Also learn how to auto-generate an AWS Glue ETL script, download it, and interactively edit it in Amazon SageMaker, connected to an AWS Glue development endpoint. After that, we show you how to deploy this script into production by adding appropriate scheduling and triggering conditions and creating an AWS CloudFormation template. The resulting datasets will automatically get registered in the AWS Glue Data Catalog, and you can then query these new datasets from Amazon EMR and Amazon Athena. Prerequisites: Knowledge of Python and familiarity with big data applications is preferred but not required. Attendees must bring their own laptops.
From data lakes to machine learning, many analytic applications rely on orchestration of the tasks in the pipeline to avoid any expensive missteps. In this session, we discuss design patterns for orchestrating ETL pipelines using Apache Oozie and Apache Airflow, and services like AWS Lambda and AWS Step Functions that enable customers to not only create a seamless flow of automated processing tasks but also easily deploy different versions of their big data application.
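To make the Step Functions option concrete, the sketch below builds a minimal Amazon States Language definition that chains two ETL tasks and catches failures. The Lambda ARNs are placeholders for illustration.

```python
import json

# Sketch of an Amazon States Language definition orchestrating two ETL
# steps with a failure catch. The Lambda function ARNs are hypothetical.
definition = {
    "StartAt": "ExtractAndTransform",
    "States": {
        "ExtractAndTransform": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:etl-transform",
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "NotifyFailure"}],
            "Next": "LoadWarehouse",
        },
        "LoadWarehouse": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:etl-load",
            "End": True,
        },
        "NotifyFailure": {"Type": "Fail", "Error": "ETLFailed"},
    },
}

# create_state_machine expects the definition as a JSON string.
state_machine_json = json.dumps(definition)
```

Because the definition is declarative JSON, deploying a new version of the pipeline is just registering a new definition, which is one reason Step Functions suits versioned big data workflows.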
Security is always a top priority for organizations building analytic applications on the cloud. Amazon EMR provides a comprehensive set of features to help secure Hadoop cluster resources and data. In this workshop, learn how to secure an Amazon EMR cluster using IAM roles and policies, Kerberos, encryption at rest and in transit, LDAP authentication in Hue and Presto, Shiro authentication in Zeppelin, and Apache Ranger, which provides granular security and auditing for common services running on Amazon EMR.
In this session, we discuss some of the challenges, pitfalls, and patterns to be aware of when designing a data lake on the AWS Cloud. We also explain when to use which service in the AWS big data portfolio of services and how to evolve a big data architecture to handle changing requirements. We also share well-architected practices on data lakes from various customers.
In this session, we take a deep dive on tips and tricks with AWS Glue for optimal performance, avoiding memory exceptions, and more.
As data exponentially grows in organizations, there is an increasing need to use machine learning (ML) to gather insights from this data at scale and to use those insights to perform real-time predictions on incoming data. In this workshop, we walk you through how to train an Apache Spark model using Amazon SageMaker that is pointed to Apache Livy and running on an Amazon EMR Spark cluster. We also show you how to host the Spark model on Amazon SageMaker to serve a RESTful inference API. Finally, we show you how to use the RESTful API to serve real-time predictions on streaming data from Amazon Kinesis Data Streams.
Companies are implementing data lakes on Amazon S3 to build unified data access platforms. Join our discussion to ask questions and learn more about best practices around structuring your data in raw and processed zones in your data lake, partitioning this data, and automatically cataloging it. We cover best practices around bucket structure, file sizes, and file formats for your data lake on Amazon S3. We also cover how to set up an AWS Glue crawler to automatically scan your data lake and build your AWS Glue Data Catalog. Additionally, we explain how crawlers run periodically to keep your table definitions up-to-date so they can be easily queried from Amazon Athena and Amazon Redshift Spectrum. Lastly, we talk about writing Grok custom classifiers to identify log files and categorize them in your catalog.
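For context on the last point: a Grok custom classifier in AWS Glue is a named pattern that, under the hood, compiles down to a regular expression. The sketch below reproduces that idea directly in Python for one common Apache access-log layout; the pattern and field names are illustrative, not Glue's own implementation.

```python
import re

# Regex equivalent of a Grok pattern for the Apache common log format.
# Named groups play the role of Grok's %{PATTERN:field} captures.
APACHE_COMMON = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def classify(line):
    # Return the extracted fields if the line matches, else None.
    m = APACHE_COMMON.match(line)
    return m.groupdict() if m else None

record = classify(
    '127.0.0.1 - - [10/Oct/2018:13:55:36 -0700] '
    '"GET /index.html HTTP/1.0" 200 2326'
)
```

A crawler armed with such a classifier can recognize log files among mixed objects in a bucket and register them in the Data Catalog with typed columns.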
In this session, we demonstrate how to build a Spark ML pipeline on Amazon EMR and deploy the model to Amazon SageMaker to serve real-time predictions. We provide all artifacts needed to build this solution.
In this workshop, get introduced to the AWS tools and technologies you can use to analyze and extract value from petabyte-scale datasets, including Amazon Redshift Spectrum.
Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. In this session, we walk through best practices for using Kinesis at scale. We cover how to efficiently write and read data from Amazon Kinesis Data Streams, how to elastically scale a stream up and down, how to properly scale applications regardless of the consumer type you have chosen, and more.
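One detail worth knowing before scaling a stream: Kinesis routes each record by taking the MD5 digest of its partition key as a 128-bit integer and matching it against each shard's hash-key range. The sketch below reproduces that mapping for the simple case of a key space split evenly across shards.

```python
import hashlib

# Kinesis maps a partition key to a shard by interpreting the MD5 digest
# of the key as a 128-bit integer and matching it against shard hash-key
# ranges. This sketch assumes the key space is split evenly across shards.
def shard_for_key(partition_key, num_shards):
    max_hash = 2 ** 128
    key_hash = int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)
    return key_hash * num_shards // max_hash

shard = shard_for_key("sensor-42", 4)
```

This is why a skewed choice of partition keys produces hot shards: all records sharing a key land on the same shard no matter how many shards the stream has.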
With the simplicity of Amazon Elasticsearch Service (Amazon ES) comes a multitude of opportunities to use it as a backend for real-time application and infrastructure monitoring. With this wealth of opportunities comes the potential for sprawl—developers in your organization deploying Amazon ES for many different workloads and many different purposes. In this case, should you centralize into one Amazon ES domain? What are the trade-offs in scale and cost? How do you control access to the data and dashboards? In this session, we explore whether, when, and how to centralize logging across your organization to minimize cost and maximize ROI.
Amazon Athena is an interactive query service that enables you to process data directly from Amazon S3 without the need for infrastructure. Since its launch at re:Invent 2016, several organizations have adopted Athena as the central tool to process all their data. In this talk, we dive deep into the most common use cases, including working with other AWS services. We review best practices for creating tables and partitions, and for optimizing performance. We also dive into how Athena handles security, authorization, and authentication. Lastly, we hear from a customer who reduced costs and improved time to market by deploying Athena across their organization.
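As a small taste of the table-creation best practice, the sketch below composes an Athena DDL statement that registers partitioned Parquet data already sitting in S3. The table schema and bucket name are hypothetical placeholders.

```python
# Sketch: composing an Athena DDL statement for partitioned data in S3.
# The column list, table name, and S3 location are placeholders.
def create_table_ddl(table, s3_location):
    return (
        f"CREATE EXTERNAL TABLE {table} (\n"
        "  request_id string,\n"
        "  status int\n"
        ")\n"
        "PARTITIONED BY (dt string)\n"
        "STORED AS PARQUET\n"
        f"LOCATION '{s3_location}'"
    )

ddl = create_table_ddl("logs", "s3://my-bucket/logs/")
```

Partitioning on a column such as `dt` lets Athena prune whole S3 prefixes from a query, which is usually the single biggest cost and performance lever.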
One of the benefits of having a data lake is that the same data can be consumed by multiple users and groups using a single computation engine. Multi-tenancy is one of the popular usages of a persistent Amazon EMR cluster. In this session, we discuss how to make an Amazon EMR cluster multi-tenant for analytics. We also share best practices for a multi-tenant cluster, and we explore some of the common challenges and mitigation strategies. We also look at the security aspects of a multi-tenant Amazon EMR cluster.
Amazon Elasticsearch Service (Amazon ES) makes it easy to deploy and use Elasticsearch in the AWS Cloud. It’s so easy that you might feel unsure that you are at the correct scale. You might be wondering how to optimize your expenditure or what best practices to use to secure your domain. In this session, we share key insights into Elasticsearch scale and security so you can deploy with confidence.
As customers build data lakes on AWS, securing access to data has become a priority. Once the data makes it into Amazon S3, there are multiple processing engines that can access the data through a SQL interface or programmatically. Federated access to data is an important requirement for enterprise customers. In this workshop, we focus on services related to Amazon EMR, and we demonstrate best practices to achieve unified authentication, authorization, encryption, and auditing capabilities. The AWS services we cover in this session include Amazon S3, IAM, AWS KMS, AWS CloudTrail, AWS Directory Service for Microsoft Active Directory, Amazon EMR, and AWS Glue. We also use LDAP authentication.