BigQuery table performance loss with TABLE_QUERY


We're seeing a surprising performance hit when querying multiple tables vs. one big one. The scenario:

We have a simple web analytics tool based on BigQuery, tracking basic events for individual sites. Up to now, we pumped each month's data into one big table. Now we're breaking the data into partitions by site and month.

So the big table is [events.all]

and now we have, say, [events.events_2014_06_siteid]

Querying the individual tables for a group is faster and processes less data. But querying our entire new dataset is much, much slower on simple queries. And the new dataset is only 1 day old, whereas the big table covers 30 days, so it's slower despite querying far less data.
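For reference, a query against a single one of the new partition tables looks something like this (the site id 1234 here is made up for illustration; 're' is the event type used in the examples below):

select count(et) from [events.events_2014_06_1234] where et='re'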

For example:

select count(et) from [events.all] where et='re' --> completed in 3.2s, processing 79MB of data. The table has 21,048,979 rows.

select count(et) from ( table_query(events, 'table_id contains "events_2014_"') ) where et='re' --> completed in 44.2s, processing 1.8MB of data. Put together, these tables have 492,264 rows.
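As I understand it, TABLE_QUERY(dataset, expr) expands to the union of every table in the dataset whose table_id satisfies the expression, so the second query should be equivalent to listing all the per-site, per-month shards by hand. A narrower pattern matching just one shard (hypothetical site id 1234 again) would look like:

select count(et) from ( table_query(events, 'table_id contains "events_2014_06_1234"') ) where et='re'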

Why does this occur, and is there a way to resolve this big disparity?

