Database design: Is it a good idea to have many keyspaces and potentially thousands of tables in Cassandra?


So, I've been using Cassandra for a while, and the architecture of our database is designed in a way that is unusual to me. The fact is, I don't have enough knowledge to decide whether that's a good design or not, as I'm new to the whole big data thing.

Here's a simplification:

  • We have vendors.
  • Each vendor has clients.
  • For each vendor, we create its own keyspace in Cassandra.
  • For each client of a vendor, we create approximately 12-15 tables in the vendor's keyspace, named clientid_tablename.
  • Tables are created dynamically when a client is created. This is slow, and I'm afraid Cassandra will fail to propagate the schema when it is under load from other operations.
  • All tables have the same schema; there is no special modelling for a given client.
  • Due to the nature of our data, around 5 of these tables can potentially have millions, if not billions, of rows.

Because of the distributed nature of Cassandra, I would never have thought that such a "manual" division of data was needed or beneficial.

This single application will have dozens of keyspaces and potentially thousands of tables per keyspace. Won't that impact performance negatively?

The impression I was given is that this design allows us to spread data more evenly, causing less performance impact when searching within a single table. That didn't make sense to me, but I didn't have arguments to counter it, as my experience with Cassandra and so-called design for big data is, at best, limited. The only benefit I can think of is being able to have different keyspace settings per vendor. I don't think that trumps all of the added complexity.

In short, is this a good idea?

First of all, when moving from an RDBMS to Cassandra, you have to re-design your ERD, and in most cases, moving a standard, normalized schema over as is turns out to be a bad decision. Right now you are trying to move your existing schema to Cassandra.

You have a table-creation-per-vendor (and per-client) workflow. You need to understand why it works that way, and whether you need it in Cassandra at all. In general you can have many tables and many keyspaces (there are limits, but they are high), but this does not fit Cassandra modeling at all.
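To illustrate why splitting data into per-client tables is not needed to "spread data evenly", here is a rough sketch of how rows in one shared table end up on different nodes purely because of their partition key. It is a simplification under stated assumptions: the node names and key format are made up, MD5 stands in for Cassandra's real partitioner, and a modulo stands in for token-range ownership.

```python
import hashlib
from collections import Counter

# Hypothetical three-node cluster; the names are placeholders.
NODES = ["node-a", "node-b", "node-c"]


def owner(partition_key: str) -> str:
    """Pick the node that would own a partition key in this toy model.

    Assumption: MD5 plus modulo is only a stand-in for hashing the key to a
    token and looking up the node that owns that token range.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    token = int.from_bytes(digest[:8], "big")
    return NODES[token % len(NODES)]


def distribution(num_clients: int) -> Counter:
    """Count how many client partitions of ONE shared table land on each node."""
    return Counter(owner(f"vendor1:client{i}") for i in range(num_clients))


if __name__ == "__main__":
    # Every client lives in the same table, yet partitions spread across nodes.
    print(distribution(3000))
```

The point of the sketch is that the spread comes from hashing the partition key, not from how many tables exist, so one table keyed by vendor and client is distributed just as well as thousands of small ones.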

In Cassandra, you should build your tables based on your queries and not on entities, objects, relations, etc. Data duplication is not considered a problem; it is a trade-off between performance and the storage needed.

I suggest you take the course on data modeling in Cassandra by DataStax. It's a great course, and it's totally free:
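As a minimal sketch of what "one table per query" could look like here, the per-client tables might collapse into a single table whose partition key carries the vendor and client. Everything named below (`events_by_client`, `vendor_id`, `client_id`, `event_time`, `payload`) is hypothetical, since the question does not show the real schema. The `%s` placeholders follow the style that the DataStax Python driver accepts for non-prepared statements.

```python
# Hypothetical table: one shared table replaces every clientid_tablename copy.
# The composite partition key (vendor_id, client_id) keeps each client's rows
# together, and event_time orders rows within that partition.
CREATE_EVENTS_BY_CLIENT = """
CREATE TABLE IF NOT EXISTS events_by_client (
    vendor_id  text,
    client_id  text,
    event_time timestamp,
    payload    text,
    PRIMARY KEY ((vendor_id, client_id), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC)
"""

# The query the table is designed for: latest events of one client.
SELECT_RECENT_EVENTS = (
    "SELECT event_time, payload FROM events_by_client "
    "WHERE vendor_id = %s AND client_id = %s LIMIT %s"
)


def recent_events_query(vendor_id: str, client_id: str, limit: int = 100):
    """Return the CQL text and bound parameters for the 'recent events' query.

    Nothing is sent to a cluster here; with a driver session, the pair could
    be passed on as session.execute(cql, params).
    """
    return SELECT_RECENT_EVENTS, (vendor_id, client_id, limit)


if __name__ == "__main__":
    cql, params = recent_events_query("vendor1", "client42", limit=10)
    print(cql)
    print(params)
```

With this shape, creating a new client needs no schema change at all. If a single client can reach billions of rows, the partition key would probably also need a time bucket (for example a month column) to keep partitions at a manageable size.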

https://academy.datastax.com/courses

