Why AWS Loves and Hates Data Gravity


I received the e-mail above from Amazon Web Services after recently signing up for another test account. The e-mail had me thinking about the impact of data gravity on AWS, both positive and negative. For those who are new to the term, data gravity is a concept first coined by Dave McCrory, current CTO of Basho. It refers to the idea that “As Data accumulates (builds mass) there is a greater likelihood that additional Services and Applications will be attracted to this data.” McCrory attributes this data gravity phenomenon to “Latency and Throughput, which act as the accelerators in continuing a stronger and stronger reliance or pull on each other.” This is so because the closer services and applications are to their data, i.e., in the same physical facility, the lower the latency and the higher the throughput. This in turn enables more useful and reliable services and applications.


A second characteristic of data gravity is that the more data is accumulated, the more difficult it is to move that data. That’s the reason services and applications tend to coalesce around data. The further you try to move data and the more data you try to move, the harder it is to do because latency increases and throughput decreases. This is known as the “speed of light problem.” Practically, this means that at a certain capacity, it becomes extremely difficult or too costly to move data to another facility, such as a cloud provider.
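To see why the speed of light problem bites at scale, a back-of-the-envelope calculation helps. The link speed and 80% sustained-utilization figure below are illustrative assumptions, not AWS numbers:

```python
# Rough estimate of how long it takes to move a dataset over a network link.
def transfer_days(data_bytes, link_gbps, utilization=0.8):
    bits = data_bytes * 8
    seconds = bits / (link_gbps * 1e9 * utilization)
    return seconds / 86400  # seconds per day

one_petabyte = 10**15  # bytes

# Roughly 116 days on a dedicated 1 Gbps link at 80% utilization --
# which is exactly why appliances like Snowball and Snowmobile exist.
print(round(transfer_days(one_petabyte, 1), 1))
```

Even a tenfold faster link only brings that down to a couple of weeks, so past a certain mass, shipping the data physically wins.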

Data gravity, therefore, represents both a challenge and an opportunity for Amazon Web Services. Given that the vast majority of data today lives outside of AWS data centers and has been accumulating for some time in locations such as customer data centers, data gravity becomes a major challenge for AWS adoption by established enterprises. This, of course, is something AWS must overcome to continue their growth beyond startups and niche workloads in enterprises. If AWS is able to remove the barriers to migrating data into their facilities, they can then turn data gravity into an advantage and an opportunity.

The opportunity that data gravity affords AWS is to continue and to extend their dominance as a cloud provider. As users store more data within AWS services such as S3 and EBS, data gravity kicks in and users often find it easier and more efficient to use additional AWS services to leverage that data more fully. This creates, for Amazon, a “virtuous cycle” where data gravity opens up opportunities for more AWS services to be used, which generates more data, which in turn opens up more services to be consumed.

Data gravity, and the need both to overcome and to utilize it, is the reason so many AWS services are focused on data and on how it can be more easily moved to AWS or more fully leveraged to produce additional value for customers. Take a look below at some of the many services that are particularly designed to attenuate or to accentuate data gravity.


  • Athena – Query service for analyzing S3 data
  • Aurora – MySQL-compatible, highly performant relational database
  • CloudFront – Global content delivery network to accelerate content delivery to users
  • Data Pipeline – Orchestration service for reliably processing and moving data between compute and storage services
  • Database Migration Service – Migrates on-premises relational databases to Amazon RDS
  • DynamoDB – Managed NoSQL database service
  • Elastic Block Store – Persistent block storage volumes attached to EC2 instances
  • Elastic File System – Scalable file storage that can be mounted by EC2 instances
  • Elastic MapReduce – Managed Hadoop framework for processing large-scale data
  • Glacier – Low-cost storage for data archival and long-term backups
  • Glue – Managed ETL service for moving data between data stores
  • Kinesis – Service for loading and analyzing streaming data
  • QuickSight – Managed business analytics service
  • RDS – Managed relational database service
  • Redshift – Petabyte-scale managed data warehouse service
  • S3 – Scalable and durable object storage for storing unstructured files
  • Snowball – Petabyte-scale service using appliances to transfer data to and from AWS
  • Snowmobile – Exabyte-scale service using shipping containers to transfer data to and from AWS
  • Storage Gateway – Virtual appliance providing hybrid storage between AWS and on-premises environments

So what are some takeaways as we consider AWS and its love/hate relationship with data gravity? Here are a few to consider:

  • If you are an enterprise that wants to migrate to AWS but is being held back by data gravity in your data center, expect that AWS will innovate beyond services like the Snowball and the Snowmobile to make migration of large data sets easier.
  • If you are a user who is “all-in” on AWS and has created and/or migrated all or most of your data to AWS, the good news is that you will continue to see an ever-growing number of services that will allow you to gain more value from that data.
  • If you are a user who is concerned about vendor/cloud provider lock-in, you need to consider carefully the benefits and consequences of creating and/or moving large amounts of data in AWS or of using higher-level services such as RDS and Amazon Redshift. (As an aside, the subject of lock-in is probably worth a dedicated blog post since I believe that it is often misunderstood. In brief, each user should consider whether the benefits of being locked in may be greater than the perceived liability, e.g., what if lock-in potentially costs me $1 million but generates $2 million in revenues over the same time period? Opportunity cost is difficult to calculate and generally ignored in the ROI models I see.)
  • Finally, if you are an AWS partner or an individual who wants to work at AWS or an AWS partner, put some focus on (in addition to security and Lambda) storage, analytics, database, and data migration services since they are all strategic for Amazon in how they deal with the positive and negative impacts of data gravity. This was evident at the most recent re:Invent conference where much of the focus was placed on storage and database services such as EFS, Snowmobile and Aurora.
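The lock-in arithmetic in the aside above can be made concrete. All figures here are hypothetical:

```python
# Toy version of the lock-in trade-off: weigh the potential cost of being
# locked in against the revenue the locked-in services enable over the same
# period. Both numbers are invented for illustration.
lock_in_cost  = 1_000_000  # estimated liability of switching away later
extra_revenue = 2_000_000  # revenue enabled by higher-level services

net_benefit = extra_revenue - lock_in_cost
print(net_benefit > 0)  # in this scenario, lock-in pays for itself
```

The hard part in practice is estimating both inputs, which is exactly the opportunity-cost gap most ROI models leave out.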

Given its importance and critical impact, AWS observers should keep a careful eye on what Amazon will continue to do to both overcome and leverage data gravity. It may very well dictate the future success of Amazon Web Services.


Welcome to The Learning AWS Blog!


I have a confession to make…

I was late to the party when it came to understanding the impact of the public cloud. I was tangentially aware of Amazon, the online bookseller I used, getting into the virtual machine “rental” business in 2007. But as a technologist in the northeast, I heard very little about Amazon Web Services in my daily dealings with enterprise customers. To me, AWS was attempting to be a hosting provider targeting start-ups and small businesses looking to save money on their IT spend.

It wasn’t until 2011 that I started having substantial conversations with enterprise customers about the possibility of moving some workloads to AWS and to public clouds. Around this time I also started hearing traditional enterprise IT vendors talk about AWS, not other traditional vendors, as potentially becoming their biggest competitor. By 2012, I had finally grasped the power and the potential of the public cloud and the “havoc” AWS was wreaking in the IT industry. By early 2013, I was writing about why most users should adopt a “public cloud first” strategy and about the unlikelihood that anyone could challenge Amazon in the public cloud space.

That was also when I started to take a serious look at an open source cloud platform project called OpenStack. At the time, it looked to be the top contender to be a private and public cloud alternative to AWS. That led me to join Rackspace and their OpenStack team in mid-2013.

Since then, AWS has continued to grow, along with other public clouds like Microsoft Azure and Google Cloud Platform. OpenStack has had missteps along with successes and is trying to find its place in the IT infrastructure space. I’ve written about that as well, wondering if OpenStack is “stuck between a rock and a hard place.” Rackspace, my employer and a co-founder of OpenStack, has pivoted as well to support both AWS and Azure alongside OpenStack.

Coming into 2017, some of my thoughts about the public cloud, AWS, private cloud and OpenStack have crystallized:

  • Even more than I did back in 2013, I believe that every company of every size should adopt a “public cloud first” strategy.
  • This doesn’t mean that I think companies should move all their workloads to the public cloud. What it does mean is, as I said back in 2013, companies should look to move workloads to the public cloud as their default option and treat on-premises workloads as the exception.
  • Eventually, the majority of workloads will move off-premises as more businesses recognize that maintaining data centers and on-premises workloads is undifferentiated heavy lifting that is more of a burden than an asset.
  • Private clouds will have a place for businesses with workloads that must be kept on premises for regulatory or other business reasons. Those reasons will, however, decrease over time.
  • Private clouds will become a platform primarily for telcos and large enterprises and their platform of choice here will be OpenStack.
  • Everybody else will adopt a strategy of moving what they can to the public cloud and keeping the rest running on bare-metal or containers or running on VMware vSphere.
  • Managed private clouds (like the Rackspace OpenStack Private Cloud offering), not distributions, will deliver the best ROI for those that choose the private cloud route because they eliminate most of the undifferentiated heavy lifting.
  • Something that could potentially change the equation for on-premises workloads is if the OpenStack project chooses to pivot and implement VMware vSphere-like functionality. This would provide what most enterprises actually want from OpenStack – “free” open source vSphere which allows them to migrate their legacy workloads off VMware vSphere.
  • If OpenStack decides to go hard after the enterprise, they should drop or refocus the Big Tent. OpenStack has 0% chance of catching the public cloud anyway and it would be better to focus on creating and refining enterprise capabilities and to do so before the public cloud vendors beat them to it.
  • Multi-cloud is real but is not for everyone and not for every use case. It makes sense if you are a mature enterprise with different workloads that might fit better on one cloud over another. For startups, the best option is to invest in one public cloud and innovate rapidly on that cloud.
  • Amazon Web Services will continue to dominate the public cloud market for the foreseeable future even though Azure and Google will make some headway.

That last thought brings us to the reason why I am starting this new blog site – The Learning AWS Blog. I believe we are still in the early days of public cloud adoption and most users are just starting to learn what platforms like AWS can do for them. My goal with this new blog is to provide a destination for those who are new to AWS and seeking to learn. Since I am one of those who still has much to learn about AWS, I have found that the best way for me to learn a technology is to try to explain what I know and have learned to others. I will continue to maintain my Cloud Architect Musings blog for other technologies such as OpenStack, containers, etc.

Over the coming weeks and months, I will be putting up blog posts, whiteboard videos and demo videos about AWS services. I will look to include every aspect of Amazon Web Services, from the basics of Availability Zones and Virtual Private Clouds, to automating infrastructure and application deployments using CloudFormation and Elastic Beanstalk, to designing scalable and highly available applications in the cloud. I will try to provide the most accurate information possible but will always welcome corrections and feedback.

In the meantime, I’ve posted some recent blog posts from my Cloud Architect Musings blog that recap announcements from the AWS re:Invent 2016 conference back in November. I hope you will find those useful along with all that I have in store for this blog site in 2017. Stay tuned and thank you for reading and viewing.

AWS re:Invent 2016 Second Keynote: We Are All Transformers

In addition to this post, please also click here to read my AWS re:Invent Tuesday Night Live with James Hamilton recap and here to read my AWS re:Invent 2016 first keynote recap from Wednesday.

After a whirlwind of product announcements from CEO Andy Jassy the previous day, it was time for Werner Vogels, CTO of Amazon Web Services, to take the stage. You can view the keynote in its entirety below. You can also read on to get a digest of Vogels’ keynote along with links to get more information about the announced new services.

Sporting a Transformers t-shirt, Vogels talked about AWS’s role in helping to bring about IT transformation. He very specifically addressed users, particularly developers, about their role as transformers in the places where they worked. AWS can do this, explained Vogels, because they have strived from the very beginning to be the most customer centric IT company on Earth.


To meet their goal of making their customers transformers in their businesses, Vogels talked about three ways that AWS can help.


In the area of development, Vogels emphasized the importance of code development and testing because that’s where users can experiment and where businesses can be agile.


To help users transform the way they do development and testing, Vogels focused on AWS products, old and new, that help bring about operational excellence, particularly in the areas of preparedness, operations and responsiveness.


In the area of preparing, Vogels talked about the importance of automating as many tasks as possible in order to build reliable, secure and efficient development, test and production environments. A key service to enable automation on AWS is CloudFormation and although no new announcements were made in this area, Vogels took some time to review the new features that have been added to CloudFormation in 2016.


Many customers make use of Chef cookbooks to prepare and to configure their AWS environments. This is in large part because of AWS OpsWorks, which is a configuration management service based on Chef Solo. Taking this to the next step, Vogels announced a new AWS OpsWorks for Chef Automate service. This new service provides a user with a fully managed Chef server, removing one more operational burden they previously had to contend with. You can read more about AWS OpsWorks for Chef Automate here.


Moving on to systems management, Vogels announced Amazon EC2 Systems Manager, which is a collection of AWS tools to help with mundane administration tasks such as packaging, installation, patching, inventory, etc. You can read more about Amazon EC2 Systems Manager here.


Transitioning to operating as the next area of operational excellence transformation, Vogels made the argument that code development and continuous integration/continuous deployment are a crucial part of operations. After reviewing the existing services that AWS has to assist users with making the code development process more agile, Vogels announced AWS CodeBuild to go with the existing CodeCommit, CodeDeploy and CodePipeline services.


AWS CodeBuild is a fully managed service that automates building environments using the latest checked-in code and running unit tests against that code. This service streamlines the development process for users and reduces the risk of errors. You can read more about AWS CodeBuild here.


Another key aspect of operating is monitoring. As he had done previously, Vogels reviewed the existing services that help users gain visibility into their environments.


Taking the next step to help users gain deeper insights into how their applications are running, Vogels harkened back to Jassy’s keynote theme of superpowers to introduce AWS X-Ray. Acknowledging the difficulty of debugging distributed systems, AWS released X-Ray to give users the ability to trace requests across their entire application and to map out the relationships between the various services in the system. This insight makes it easier for developers to troubleshoot and to improve their applications. You can read more about AWS X-Ray here.


The final area of operational excellence Vogels covered was responding. How can users respond to errors and alarms and do so in an automated fashion that can also escalate issues in a timely fashion when necessary?


One answer from AWS is the new AWS Personal Health Dashboard. Based on the existing AWS Service Health Dashboard, this new service provides users with a personalized view of the system health of AWS. The new dashboard shows the performance and availability of the services a user is accessing. Users will also receive alerts triggered by a degradation in the services they are leveraging, and they can write Lambda functions to respond to those events. You can read more about the AWS Personal Health Dashboard here.
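As a sketch of the kind of Lambda responder described above. The event shape used here is an assumption for illustration, not the documented AWS Health event schema:

```python
# Minimal Lambda-style handler that turns an assumed health-event payload
# into an alert. In a real deployment the handler would be wired to the
# event source and might publish to SNS; here it simply returns the alert.
def handler(event, context):
    detail = event.get("detail", {})
    service = detail.get("service", "unknown")
    category = detail.get("eventTypeCategory", "unknown")
    return {"alert": f"Health event for {service}: {category}"}

# Local invocation with a sample (hypothetical) event:
sample = {"detail": {"service": "EC2", "eventTypeCategory": "issue"}}
print(handler(sample, None))
```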


AWS and their customers also have to respond to security issues. Distributed Denial of Service attacks have been the top threat for web applications with many different types of attacks at different layers of the networking stack. Historically, most of these DDoS attacks have tended towards Volumetric and State Exhaustion attacks.


To address these attacks, Vogels announced AWS Shield. This is a managed service that works in conjunction with other AWS services like Elastic Load Balancing and Route 53 to protect user web applications. AWS Shield comes in two flavors – AWS Shield Standard and AWS Shield Advanced. Standard is available to all AWS customers at no extra cost and protects users from 96% of the most common attacks.


AWS Shield Advanced provides additional DDoS mitigation capability for volumetric attacks, intelligent attack detection, and mitigation for attacks at the application & network layers. Users get 24×7 access to the AWS DDoS Response Team (DRT) for custom mitigation during attacks, advanced real-time metrics and reports, and DDoS cost protection to guard against bill spikes in the aftermath of a DDoS attack. You can read more about AWS Shield Standard and AWS Shield Advanced here.


Transitioning away from transforming operational excellence, Vogels moved to transformation through using data as a competitive differentiator. Because of the cloud, Vogels asserted, everyone has access to services such as data warehousing and business intelligence. What will differentiate companies from each other will be the quality of the data they have and the quality of the analytics they perform on that data.

The first new service announcement that Vogels made in this area was Amazon Pinpoint. Pinpoint is a service that helps users run targeted campaigns to improve user engagement. It uses analytics to help define customer target segments, send targeted notifications to a segment, and track how well a particular campaign did. You can read more about Amazon Pinpoint here.


Moving on, Vogels argued that 80% of analytics work is not actually analytics but hard work to prepare and to operate an environment where you can actually do useful queries of your data. AWS is on a mission to flip this so 80% of analytics work done by users will actually be analytics.


Vogels argued that AWS already has a number of services to address most of the work that falls into that 80% bucket. To address even more of that 80%, Vogels introduced a new service called AWS Glue. Glue is a data catalog and ETL service that simplifies movement of data between different AWS data stores. It also allows users to automate tasks like data discovery, conversion, mapping and job scheduling. You can read more about AWS Glue here.


By adding AWS Glue, Vogels argued that AWS now has all the pieces required to build the industry’s best modern data architecture.


Another need for users in this space, said Vogels, is large-scale batch processing, which normally requires a great deal of heavy lifting to set up and use. To help here, Vogels announced AWS Batch. Batch is a managed service that lets users do batch processing without having to provision, manage, monitor, or maintain clusters. You can read more about AWS Batch here.


The last area of transformation Vogels addressed took him back to the roots of AWS – compute. Except, of course, “compute” at AWS has grown beyond Elastic Compute and virtual machines. Vogels reminded the audience that AWS compute has now grown to also include containers with Elastic Container Service and serverless/Functions as a Service with Lambda.


Since all the new compute announcements were made by Jassy in the previous keynote, Vogels focused on the containers and Lambda parts of their compute spectrum. For users of ECS, Vogels previewed a new task placement engine which will give users finer-grained control over scheduling policies.


Beyond this, Vogels acknowledged that customers have requested the flexibility to build their own custom container schedulers to work with ECS or to integrate with existing schedulers such as Docker Swarm, Kubernetes or Mesos. To enable this, Vogels announced that AWS is open sourcing Blox, a collection of open source projects for building container management and orchestration services for ECS.


The first two components of Blox are a cluster state service for handling event streams coming from ECS and a daemon scheduler that helps launch daemons on container instances. You can read more about Blox here.


Moving on to the last compute area, Vogels talked about serverless and Lambda. Lambda already supported a number of languages, and AWS added to that list with support for C#.


Vogels then mentioned that one of the most frequent requests they receive from users is the ability to execute tasks at the edge of the AWS content delivery network instead of having to go back to a source further away and incur unwanted extra latency. To address this request, Vogels announced AWS Lambda@Edge. This new service can inspect HTTP requests and execute Lambda functions at CloudFront edge locations when appropriate. You can read more about AWS Lambda@Edge here.


Finally, to coordinate multiple Lambda functions in a simple and reliable manner, Vogels announced AWS Step Functions. This service gives users the ability to visually create a state machine which specifies and executes all the steps of a Lambda application. A state machine defines a set of steps that perform work, make decisions, and control progress of Lambda functions. You can read more about AWS Step Functions here.
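A Step Functions state machine is defined in JSON (the Amazon States Language). Here is a minimal two-step sketch, expressed as a Python dict for readability; the Lambda ARNs are placeholders:

```python
import json

# A minimal state-machine definition: two Task states run one after the
# other, each backed by a (placeholder) Lambda function.
state_machine = {
    "Comment": "Two-step sketch coordinating Lambda functions",
    "StartAt": "ExtractData",
    "States": {
        "ExtractData": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:extract",
            "Next": "LoadData",
        },
        "LoadData": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:load",
            "End": True,
        },
    },
}

# The JSON form is what would actually be handed to the service.
definition = json.dumps(state_machine, indent=2)
print(definition)
```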


Wrapping up his keynote, Vogels summarized all the product announcements that had been made during his and Jassy’s keynotes.


With that, Vogels ended his keynote with a charge to the audience to use all the tools they have been given to go and transform their businesses.

AWS re:Invent 2016 First Keynote: Andy Jassy Is Your Shazam

In addition to this post, please also click here to read my AWS re:Invent Tuesday Night Live with James Hamilton recap and here to read my AWS re:Invent 2016 second keynote recap from Thursday.


I grew up watching a TV show called Shazam! which was based on a comic I also read by the same name. The main protagonist was a superhero called Captain Marvel, who was given his superpowers by a wizard named Shazam. Captain Marvel used the power of Shazam to fight evil and to help save the human race.

At the first keynote for AWS re:Invent 2016, Andy Jassy, CEO of Amazon Web Services, played the part of the wizard who could give everyone cloudy superpowers as he wrapped the keynote around the theme of superpowers. You can view the keynote in its entirety below. You can also read on to get a digest of Jassy’s keynote along with links to get more information about the announced new services.

To set the table, Jassy started the keynote with a business update before giving what everyone in attendance and tuning in was waiting for – a litany of new AWS features and capabilities.


Amazon Web Services continues to grow at an astounding rate with no letup in sight. It is by far the fastest-growing billion-dollar enterprise IT company in the world, suggesting that it is a safe choice for enterprises.


And the growth is not just coming from startups anymore but includes a growing stable of enterprise customers.


While the keynote included something for everyone, Jassy clearly had new enterprise customers in mind as he walked through the value proposition for AWS, explained basic AWS services, unveiled new services and directed his ire at Larry Ellison and Oracle. And to frame the rest of his keynote, Jassy assumed his Shazam wizard persona and explained what AWS can do for customers to give them cloudy superpowers.


The first superpower theme to be highlighted was supersonic speed and how AWS enables customers to move more quickly. This refers not only to customers being able to launch thousands of cloud instances in minutes but also to the ability to go from conception to realization of an idea by taking advantage of the many services that AWS has to offer.


While AWS already boasts more services than any other cloud provider, Jassy pointed out that their pace of innovation has been increasing, reaching 1,000+ new features or significant capabilities rolled out in 2016. That equates to an average of roughly 3 new capabilities added per day.
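That per-day figure is simple division:

```python
# 1,000+ features over 2016 (a leap year, so 366 days).
features_2016 = 1000
per_day = features_2016 / 366
print(round(per_day, 1))  # about 2.7, i.e. roughly 3 new capabilities per day
```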


Continuing the focus on supersonic speed, Jassy followed with announcements about new EC2 instance types to add to the already burgeoning compute catalog. In particular, updates to four instance type families, to meet varying compute use cases, were announced.


Two new extra-large instance types were added to the T2 family, which respectively double and quadruple the resources of the large instance type. T2 instances are suited for general-purpose workloads that require occasional bursting, and the extra-large instances give users more bang for their buck while providing even more burst capacity. You can read about the new T2 instance types here.


For memory intensive workloads, a new R4 instance type was announced which effectively doubled the capabilities of the previous R3 instance type. This memory-optimized instance type is suitable for any workload that benefits most from in-memory processing. You can read more about the new R4 instance type here.


A new I3 instance type was introduced that is optimized for I/O intensive workloads. This new instance type will use SSDs to increase IOPS capabilities by orders of magnitude over the current I2 instance type. The I3 will be ideally suited for transaction oriented workloads such as databases and analytics. You can read more about the new I3 instance type here.


Next up was the new C5 compute-optimized instance type using the new Intel Skylake CPU. The C5 will be suitable for CPU-intensive workloads such as machine learning and financial operations requiring fast floating-point calculations. You can read more about the new C5 instance type here.


Another area where speed is important is computational workloads that require a Graphics Processing Unit (GPU) to offload processing from the CPU. Jassy announced that AWS is working on a feature called Elastic GPUs for EC2. This will allow GPUs to be attached to any instance type as workload demands require, similar in concept to Elastic Block Store. You can read more about Elastic GPUs here.


The last new instance type to be announced was the F1 instance type utilizing customizable FPGAs which will give developers the flexibility to program these instances to meet specific workload demands in a way that could not be done with standard CPUs. You can read more about the new F1 instance type here.


Accelerating how fast users can move goes beyond new hardware and new instance types. There is also the need to simplify complex tasks whenever possible. Cloud providers like Digital Ocean have carved out a strong niche market by specializing in offering no-frills Virtual Private Servers (VPS). A VPS is a low-cost hosted virtual server that is designed to be easy for users to set up and suitable for running applications that do not have high performance requirements.

AWS is taking VPS providers like Digital Ocean head on with their new Amazon Lightsail service. For as little as $5 a month, users can launch new instances in their VPC and do so by walking through minimal configuration steps.


Behind the scenes, Lightsail will create a VPS preconfigured with SSD-based storage, DNS management, and a static IP address. As underscored below, all the steps in the box are performed on behalf of the user. You can read more about Amazon Lightsail here.


Moving on to the next superpower that AWS can give users, Jassy talked about x-ray vision and how it can benefit cloud users. The first benefit was mainly a not-so-subtle dig at Larry Ellison, Oracle and other legacy vendors.


Jassy’s argument was that on AWS, users can run their own tests and benchmarks on true production-like environments instead of accepting the word of untrustworthy vendors. It was one of many negative attacks on Oracle during Jassy’s keynote.

Getting back on point, Jassy talked about the benefit for users of being able to perform business analytics on the data they’ve uploaded to AWS as part of the x-ray vision power that AWS gives to them. Jassy then highlighted the breadth of the existing AWS services for doing analytics to help users better understand their customers.


Enhancing this portfolio, Jassy unveiled a new service called Amazon Athena. Athena is a new query service for analyzing stored S3 data using standard SQL. In essence, users can treat their S3 as a data lake and perform queries against unstructured data to unearth actionable intelligence. You can read more about Amazon Athena here.
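Since Athena accepts standard SQL, the query shape is familiar. As a purely local stand-in (SQLite instead of Athena, with an invented access-log schema), this is the kind of aggregation a user might run over log files sitting in an S3 bucket:

```python
import sqlite3

# Local illustration only: SQLite plays the role of Athena, and the
# access_logs table stands in for unstructured log files in S3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE access_logs (status INTEGER, bytes_sent INTEGER)")
conn.executemany("INSERT INTO access_logs VALUES (?, ?)",
                 [(200, 512), (200, 1024), (404, 0)])

# Standard SQL aggregation: hits and bytes served per HTTP status code.
rows = conn.execute(
    "SELECT status, COUNT(*) AS hits, SUM(bytes_sent) AS total_bytes "
    "FROM access_logs GROUP BY status ORDER BY status"
).fetchall()
print(rows)  # [(200, 2, 1536), (404, 1, 0)]
```

The point is that no cluster is provisioned anywhere in this workflow; with Athena, the same query runs directly against files in S3.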


Another benefit of “x-ray vision” which Jassy presented was the ability for users to see meaning inside their data through artificial intelligence. Jassy pointed out that Amazon, the parent company, has been leveraging artificial intelligence and deep learning for their own businesses.


Naturally, AWS is leveraging the learnings and tools of Amazon to create a suite of new services focused on artificial intelligence called Amazon AI.

Screen Shot 2016-11-30 at 11.52.12 AM.png

The first service in the suite is Amazon Rekognition for image recognition and analysis. This service is powered by deep learning technology developed inside Amazon that is already being used to analyze billions of images daily. Users can leverage Rekognition to create applications for use cases such as visual surveillance or user authentication. You can read more about Amazon Rekognition here.

Screen Shot 2016-11-30 at 11.52.40 AM.png

Moving from image to voice AI, Jassy next introduced Amazon Polly, a service for converting text to speech. Polly initially supports 24 different languages and can speak in 47 different voices. Also powered by deep learning technology created by Amazon, Polly can correctly render text with ambiguous pronunciations by understanding the context of the text. Users can leverage Polly to create applications that require any type of computer-generated speech. You can read more about Amazon Polly here.
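For a sense of the API's shape, here is a hedged sketch of a Polly text-to-speech request. The voice name and settings are illustrative; the actual call would go through the boto3 Polly client's `synthesize_speech` method.

```python
# Illustrative Polly request: plain text in, an audio stream out.
# The specific voice and format shown here are examples, not an exhaustive list.
request = {
    "Text": "Hello from re:Invent 2016.",
    "OutputFormat": "mp3",
    "VoiceId": "Joanna",  # one of the 47 voices mentioned in the keynote
}

# With boto3 installed one would call:
#   response = polly_client.synthesize_speech(**request)
# and write response["AudioStream"] to a file for playback.
```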

Screen Shot 2016-11-30 at 11.54.25 AM.png

Rounding out the new AI suite, Jassy introduced Amazon Lex for natural language understanding and voice recognition. Lex is based on the same deep learning technology behind Alexa, which powers the Amazon Echo. Users can build Lex-based applications such as chatbots or anything else that supports conversational engagement between humans and software. You can read more about Amazon Lex here.


Another superpower trumpeted by Jassy was that of flight, which he used as a metaphor for having the freedom to build fast, to understand data better and, most importantly, to escape from hostile database vendors. To incentivize users to leave their traditional database vendors, AWS had previously introduced the AWS Database Migration Service and the Amazon Aurora MySQL-Compatible database service. As it turned out, enterprises liked Aurora but also wanted support for PostgreSQL. So Jassy took this opportunity to announce a new Amazon Aurora PostgreSQL-Compatible database service.

Screen Shot 2016-11-30 at 12.48.04 PM.png

This new service uses a modified version of the PostgreSQL database that is more scalable, delivers 2x the performance of the open source version of PostgreSQL, and maintains 100% API compatibility. You can read more about PostgreSQL for Aurora here.

Screen Shot 2016-11-30 at 12.48.21 PM.png

The last superpower discussed by Jassy was shape-shifting, which was another metaphor, this time for AWS’ ability to integrate with on-premises infrastructures. To kick off this section of the keynote, Jassy revisited an announcement that had been made previously of a joint service called VMware Cloud on AWS. This service is simply a managed offering, running on AWS, that supports VMware technologies such as vSphere, vSAN and NSX. You can read more about VMware Cloud on AWS here.

Screen Shot 2016-12-05 at 12.00.00 AM.png

Then in perhaps a somewhat tortured attempt to keep to the current theme, Jassy tried to expand the meaning of on-premises infrastructure beyond servers in the data center to sensors and IoT devices.

Screen Shot 2016-12-05 at 12.11.32 AM.png

Making the transition to IoT services, Jassy discussed the challenges of running devices at the edge of the network in order to collect and process data from sensors and a growing number of IoT devices.


To help address these challenges, Jassy announced the new AWS Greengrass service, which embeds AWS services like Lambda in field devices. Manufacturers can OEM Greengrass for their devices, and users can leverage Greengrass to collect data in the field, process the data locally, and forward it to the cloud for long-term storage and further processing. You can read more about AWS Greengrass here.
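As a rough illustration of the pattern Greengrass enables, here is a toy Lambda-style handler of the kind a field device could run: it aggregates raw sensor readings locally and decides whether a summary is worth forwarding to the cloud. The event shape and threshold are made up for the example.

```python
# Toy Lambda-style handler: process sensor data at the edge, ship only a
# compact summary upstream. Event format and threshold are hypothetical.
def handler(event, context=None):
    readings = event.get("temperature_readings", [])
    if not readings:
        return {"forward_to_cloud": False}
    summary = {
        "count": len(readings),
        "avg": sum(readings) / len(readings),
        "max": max(readings),
    }
    # Only forward to the cloud when something noteworthy happened locally.
    summary["forward_to_cloud"] = summary["max"] > 80.0
    return summary

result = handler({"temperature_readings": [72.5, 75.0, 83.5]})
```

The point of the pattern is bandwidth and latency: the raw stream stays on the device, and the cloud only sees the distilled result.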


Of course, any discussion about on-premises infrastructure by AWS ultimately leads back to their desire to move all on-premises workloads to what they consider the only true cloud – AWS. So perhaps it’s no surprise that Jassy would wrap up his keynote with two solutions for expediting the migration of data to AWS.

During the last re:Invent in 2015, AWS announced the Snowball, a 50 TB appliance for importing and exporting data to and from AWS. As these Snowball appliances have been put to use, customers have expressed a desire for additional capabilities, such as local processing of data on the appliance. To facilitate these new capabilities, Jassy announced the new AWS Snowball Edge.

Screen Shot 2016-11-30 at 1.14.04 PM.png

The Snowball Edge adds more connectivity, doubles the storage capacity to 100 TB, enables clustering of two appliances, adds new storage endpoints that can be accessed from existing S3 and NFS clients, and adds Lambda-powered local processing. You can read more about the AWS Snowball Edge here.

Screen Shot 2016-11-30 at 1.14.31 PM.png

Going back to the enterprise and rounding out the keynote, Jassy asked the question, “What about for Exabytes (of data)?” The answer, Jassy proposed, is a bigger box. Then in a demonstration of showmanship worthy of any legacy vendor, out came the new Amazon Snowmobile.

Screen Shot 2016-12-05 at 12.54.36 AM.png

The proposition of the Snowmobile is very simple. Enterprises will be able to move 100 PB of data at a time, so that an exabyte-scale data transfer that would take ~26 years over a 10 Gbps dedicated connection can be completed in ~6 months using Snowmobiles. You can read more about the AWS Snowmobile here.
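The keynote's arithmetic is easy to sanity-check. Assuming decimal units and a fully saturated link, a quick back-of-the-envelope calculation lands right around the quoted figure:

```python
# Sanity check of the "~26 years over 10 Gbps" claim (decimal units assumed).
EXABYTE_BITS = 1e18 * 8          # 1 EB expressed in bits
LINK_BPS = 10e9                  # a 10 Gbps dedicated connection
SECONDS_PER_YEAR = 365 * 24 * 3600

years_over_wire = EXABYTE_BITS / LINK_BPS / SECONDS_PER_YEAR
print(round(years_over_wire, 1))  # ≈ 25.4 years at full line rate
```

Real-world transfers would be slower still (protocol overhead, the link never staying saturated), which is how ~25 years of theoretical minimum becomes the "~26 years" quoted on stage, and why trucking the data wins at this scale.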

Screen Shot 2016-12-05 at 12.55.41 AM.png

The spectacle of the Snowmobile being driven on stage proved to be an appropriate capper to the morning keynote with Andy Jassy’s turn as the superpower-giving wizard, Shazam.

AWS re:Invent 2016 Tuesday Night Live: “Don’t Try This At Home”


This year’s AWS re:Invent was bigger than ever with ~32,000 attendees. A highlight at the start of each re:Invent is a presentation by James Hamilton, VP and Distinguished Engineer at Amazon Web Services, affectionately called “Tuesday Night Live at AWS re:Invent.” You can view the recording of Hamilton’s ode to data center geeks everywhere below.

I’ve also provided a recap of his talk in this blog post if you want a short digest of the presentation. And if you are interested, you can also read my recap of the first keynote here and my recap of the second keynote here.

This session is typically an opportunity for Hamilton to give the public a behind-the-scenes look at some of the inner workings of AWS and to show off their technical innovation chops. That was definitely front and center with updates on things like Amazon’s network architecture and transition to renewable energy in their data centers. However, Hamilton was also clearly targeting his message to enterprises and Fortune 500 companies, specifically those who are considering AWS or are just dipping their toes in the world’s largest cloud. The not-so-subtle message Hamilton delivered to enterprises was, “You and your vendor partners can’t even come close to what we can build and maintain, so stop trying to build your own data centers and come to AWS.”

Hamilton started on this theme immediately in his talk with a comparison that shows the scale of AWS.

Screen Shot 2016-11-29 at 11.09.57 PM.png

The implication is that AWS has added, and is adding, enough capacity to handle the workload of any number of Fortune 500 companies and then some. The pitch to enterprises is that by moving to AWS, they can relieve themselves of the undifferentiated heavy lifting of data center management and focus on creating business innovation. AWS promises customers that they can go fast, scale their resources up or down as needed and do it for “a whole lot of heck cheaper” than if they did it themselves.

To further underscore their scale, Hamilton then went through a series of slides that showed the high-level architecture for AWS.

Screen Shot 2016-11-29 at 11.15.04 PM.png

AWS currently has 14 Regions worldwide and plans to add 4 more. A Region is a distinct geographic location where AWS has a data center presence. Within each Region are multiple Availability Zones, which are discussed later. A growing number of AWS Regions across the globe is critical for enterprise adoption because it addresses issues such as latency and data sovereignty.

Screen Shot 2016-11-29 at 11.17.33 PM.png

Connecting all these Regions is the Amazon Global Network with redundant 100GbE links that stretch across the globe. Building something similar for on-premises infrastructures would be very costly for any single enterprise.

Screen Shot 2016-12-01 at 8.25.46 AM.png

Drilling down into the Regions themselves, Hamilton tried to convey the scale and capacity of AWS. Each Region is composed of 2 or more Availability Zones, and there are currently 38 of those across the 14 Regions. An Availability Zone, or AZ, consists of one or more data centers, each with redundant power, networking and connectivity and hosted in separate facilities. The concept of AZs is important to note because applications can be architected for high availability by deploying them across multiple AZs.
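To illustrate the multi-AZ idea, here is a toy sketch that spreads a fleet of instances round-robin across two zones, so the loss of one zone leaves half the fleet serving traffic. The instance and AZ names are hypothetical.

```python
# Toy illustration of multi-AZ placement for high availability.
# Instance IDs and AZ names below are made up for the example.
from itertools import cycle

def place_instances(instance_ids, azs):
    """Assign each instance to an Availability Zone in round-robin order."""
    return dict(zip(instance_ids, cycle(azs)))

placement = place_instances(
    ["web-1", "web-2", "web-3", "web-4"],
    ["us-east-1a", "us-east-1b"],
)
# Each zone ends up with half the fleet; if one zone fails,
# the application keeps running on the instances in the other.
```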

Screen Shot 2016-12-01 at 8.28.17 AM.png

An Availability Zone can span as many as 8 data centers, and some AZs host 300,000+ servers. This gives AWS a total footprint that is larger than the next 14 public cloud providers combined. It also demonstrates their ability to scale to meet the requirements of any enterprise.

At this level of scale, AWS claims they have no choice but to create custom technologies for their data centers. This runs the gamut from compute to networking to storage.

Screen Shot 2016-12-01 at 8.30.07 AM.png

Screen Shot 2016-12-01 at 8.32.39 AM.png

Screen Shot 2016-12-01 at 8.33.37 AM.png

This is the only way AWS can gain the cost efficiencies, performance and scalability needed to successfully run their cloud. It also allows AWS, claims Hamilton, to build more reliable data centers since they don’t have to rely on typical enterprise vendors who build overly complex products filled with bloat. Again, the point being made is that AWS is on the leading edge of technological innovation and data center management, so enterprises should not try this for themselves but rely on the experience and expertise of Amazon Web Services.

Having, in his view, proven AWS’s superiority in managing data centers at scale, Hamilton moved on to one of the elephants in the room when it comes to traditional enterprises moving workloads to the cloud: “What about the mainframe?” Many enterprises still run mission-critical applications on their mainframes, and that is a workload you don’t just move to any cloud, including AWS.

To address this, Hamilton introduced Navin Budhiraja, Senior VP and CTO of Infosys, who talked about their new Mainframe Migration to AWS service offering.

Screen Shot 2016-12-01 at 8.36.07 AM.png

This offering by Infosys, claims Budhiraja, helps customers solve three problems that enterprises face if they do not migrate off the mainframe to the cloud: escalating costs, lack of agility and a shortage of skills. Clearly, this was another way to tell enterprises they have no excuse for not migrating to the public cloud.

Going back to the theme of scale, Hamilton then introduced Tom Soderstrom, CTO at the NASA Jet Propulsion Lab. Soderstrom talked about their mission to answer the big questions of the Universe through initiatives such as the Deep Space Network.

Screen Shot 2016-11-30 at 12.06.04 AM.png

Soderstrom explained that only a public cloud platform like AWS could satisfy the compute and storage requirements of the Jet Propulsion Lab. This is a scale that few enterprises, even the largest, would need or could build.

Moving on, Hamilton introduced Dr. Matt Wood, General Manager of Product Strategy at AWS. Dr. Wood talked about Machine Learning and its application to Artificial Intelligence and Deep Learning. He reiterated that addressing the challenge of Deep Learning requires scalable infrastructure.

Screen Shot 2016-12-01 at 8.38.59 AM.png

The evening ended with an update by Hamilton on their progress towards becoming a 100% renewable energy business. The progress made by AWS in this area is noteworthy.

Screen Shot 2016-12-01 at 8.40.16 AM.png

With that last update, Tuesday Night Live at AWS re:Invent came to a close. You can read my recap of the first keynote here and my recap of the second keynote here.