问题描述:

How to choose the primary key in DynamoDB if the item can only be uniquely identified by three or more attributes? Or this is not the correct way to use NOSQL database?

网友答案:

Generally, if your items are uniquely identified by three or more attributes you can concatenate the attribute values and form a composite string key that you can use as a hash key in the Dynamo tbale.

You can de-duplicate the attribes from the hash key into separate attributes on the item if you need to create indexes on them, or if you need to use them in conditional expressions.

The rules for relational databases normal form don't necessarily apply to NoSQL databases and, in fact, a denormalized schema is usually preferred.

To expand the concept, it is typical (and usually desirable) when designing relational database schemas to use normalized form. One of the normalized forms dictates that you should not duplicate data that represents the same "thing" in your database.

I'm going to use an example that has just two parts to the key but you can extend it further.

Let's say you're designing a table that contains geographical information in the united states. In US, a Zip Code consists of 5 digits and an additional 4 digits that may subdivide the region.

In a relational database you might use the following schema:

  Zip    |   Plus4   |  CityName     |  Population
---------+-----------+---------------+---------------  
 CHAR(5) |  CHAR(4)  | NVARCHAR(100) |  INTEGER

With a composite primary key of Zip, Plus4

This is perfect because the combination of Zip and Plus4 is guaranteed to be unique and you can answer any query against this table regardless of whether you have both the Zip and the additional Plus4 code, or just the Zip. And you can also get all the Plus4 codes for a Zip code rather easily.

If you wanted to store the same information in Dynamo you might create a hash key called "ZipPlus4" that is of type String and which consists of the Zip code concatenated with the Plus4 code (ie. 60210-4598) and then also store two more attributes on the item, one of which is the Zip code by itself and the other which is the Plus4 by itself. So an item in your table might have the following attributes:

  ZipPlus4 | Zip     |  Plus4   |  CityName   |  Population
-----------+---------+----------+-------------+---------------  
  String   | String  |  String  |  String     |  Number

The ZipPlus4 above would be the Hash key for the table.

Note that in the example above you could get away with having a hash key of "Zip" and a range key of "Plus4" but as you saw, when you have more than 2 parts you need something different.

相关阅读:
Top