DynamoDB - a serverless database

So far so good. Now that we know how to discover if a problem occurs, we can continue.

After creating the API Gateway endpoints, the "Sauf Pompiers" SARL folks want us to actually store and list the receipts into the database. Which database and how do we do that with an AWS Lambda?

You can see what we did so far in the following diagram.

Now we need a database to fulfill what we promised. But how do we store state within a serverless environment? We can connect each of our Lambda's to the current database we use (on-premise / RDS) but what seems to be the problem here?

It can scale from 0 to 1000 concurrent connections, from day one.

AWS Lambda, as a Function as a Service, tends to scale independently and massively, by default up to 1000 concurrent invocations per one Lambda. And how many DB connections do we have usually on a project?

Our databases should be able to scale as well. But this contradicts our Serverless Model - Pay per Use. Let's take an example that we had a full-blown serverless database, a Serverless RDS. That would also be against our model.

Do you know why?

Take some time to think about the answer.

Want to know the answer?

It's because we are scaling unneeded infra for the data we don't need!

Imagine you have a 100GB database. To make it scalable, you would need to instantly load up or cache such a quantity of data, or some other shenanings. Of course, that is possible, if we hide away all of such mechanics as well, as you might have heard about Serverless Aurora, but regardless we need a different approach.

Now that you know why do we need a different approach, we need to ask ourselves what would be an equivalent in the concept of a Function as a Service.

The answer is pretty obvious.

A data table as a service.

Do we know such a service?

And as some of you may have already guessed, its DynamoDB.

DynamoDB

As you can read from the AWS website it is:

Amazon DynamoDB is a key-value and document database, fully managed, multiregion, ultimaster, durable database with built-in security, backup and restore, and in-memory caching for internet-scale applications. DynamoDB can handle more than 10 trillion requests per day and can support peaks of more than 20 million requests per second.

Seems good, but how do we use it?

Here is an example with CloudFormation:

DynamoDBTable:
  Type: AWS::DynamoDB::Table
  Properties:
    TableName: "YourTableName"
    AttributeDefinitions:
      - AttributeName: "YourTableNameId"
        AttributeType: S
    KeySchema:
      - AttributeName: "YourTableNameId"
        KeyType: HASH
    BillingMode: PAY_PER_REQUEST
    SSESpecification:
      SSEEnabled: True

What are all these properties? You can guess some, but some are quite intruiging.

AttributeDefinitions, simply a list of attributes
KeySchema, the keys schema for all the database keys
BillingMode, configures how you want to pay for DynamoDB, two available options are PROVISIONED and PAY_PER_REQUEST.
SSESpecification, one of the most important, enables Server Side Encryption.

But what is most interesting is actually the Type. We are actually creating AWS::DynamoDB::Table - a table, not a whole database.

Most importantly, it is a NoSQL database, with support of Primary Indexes, Secondary Indexes, and Sort Indexes.

Indexes

What are these indexes and how do work?

Amazon DynamoDB provides fast access to items in a table by specifying primary key values. But if you want to fetch the data of attributes other than the primary key, indexing comes into the picture. Most of you know that, so let's jump into the DynamoDB details.

Now, what is a primary, and what is a secondary index?

Answer

A primary index is an index on a set of fields that includes the unique primary key and is guaranteed not to contain duplicates. In contrast, a secondary index is an index that is not a primary index and may have duplicates.

AWS DynamoDB offers both Local and Global Secondary Indexes.

Global Secondary Index − This index includes a partition key and sort key, which may differ from the source table. It uses the label “global” due to the capability of queries/scans on the index to span all table data, and over all partitions.

Local Secondary Index − This index shares a partition key with the table, but uses a different sort key. Its “local” nature results from all of its partitions scoping to a table partition with identical partition key value.

Now we're ready to jump into actually get our Lambda to interact with DynamoDB.

How to use a DynamoDB with AWS Lambda

To get our Lambda code to get /store data from a DynamoDB Table, we need to use the AWS SDK. Here are some example AWS SDK API links for JavaScript and Java for DynamoDB.

An example JavaScript code for getting data from a DynamoDB table


await dynamoDb.scan( {
    TableName: TABLE_NAME
  }).promise();

And here is the example of storing data into a DynamoDB table:

 await dynamoDb.put({
    TableName: TABLE_NAME,
    Item: {
      someTableId: "value"
      attr1: "value"
    }
  }).promise();

This looks simple enought, right?

But unfortunately, if you remember, Lambda functions are isolated and permissionless. And in this case they really don't have any permissions.

To add those you need to give a Lambda a Policy with a Statement, like this:

  Properties:
  Policies:
    - Version: '2012-10-17'
      Statement:
        - Effect: Allow
          Action:
            - dynamodb:PutItem
          Resource: !GetAtt "YourDynamoTable.Arn"

This policy allows the function to put data into the YourDynamoTable DynamoDB table, based on its ARN - which is Amazon Resource Name.

Note: You can add more than one Action Item, but not more than one Resource. You will need to create a separate Policy Statement for each Resource. Meaning that one Lambda is able to access more than one DynamoDB table. The Resource parameter requires a ARN.

Task

This covers the basics, but for the Task you'll have a slightly elevated challenge.

This task is going to e Steps are:

Create a DynamoDB table expenses.
Add the permissions to the Lambda to be able to save to the database but by using AWS SAM Policy Templates.
Implement the necessary code to save the receipt to the DynamoDB table from the enter-expense Lambda function.
(Bonus) Enable the list-expenses Lambda to retrieve all the data from the database.
(Bonus) Enable the update-expense Lambda to update data from the database.

Next Lesson

When you finish this, feel free to go ahead to Testing!

06. DynamoDB a serverless database