sql server - How can I optimize this query which works with 24M row table? -
i have table 24 milion rows.
i want run query:
select r1.userid, r2.userid, sum(r1.rate * r2.rate) sum dbo.ratings r1 join dbo.ratings r2 on r1.movieid = r2.movieid r1.userid <= r2.userid group r1.userid, r2.userid as tested, took 24 hours produce 0.02 percent of final result.
how can speed up?
here definition of table:
create table [dbo].[ratings]( [userid] [int] not null, [movieid] [int] not null, [rate] [real] not null, primary key clustered ( [userid] asc, [movieid] asc )with (pad_index = off, statistics_norecompute = off, ignore_dup_key = off, allow_row_locks = on, allow_page_locks = on) on [primary] ) on [primary] create nonclustered index [ix_ratingsmovies] on [dbo].[ratings] ( [movieid] asc )with (pad_index = off, statistics_norecompute = off, sort_in_tempdb = off, drop_existing = off, online = off, allow_row_locks = on, allow_page_locks = on) on [primary] create nonclustered index [ix_ratingsusers] on [dbo].[ratings] ( [userid] asc )with (pad_index = off, statistics_norecompute = off, sort_in_tempdb = off, drop_existing = off, online = off, allow_row_locks = on, allow_page_locks = on) on [primary] here execution plan:
the workaround suggested create "reverse" index:
create index ix_ratings_reverse on ratings(movieid, userid) include(rate); and force sql server use it:
select r1.userid, r2.userid, sum(r1.rate * r2.rate) sum dbo.ratings r1 join dbo.ratings r2 (index(ix_ratings_reverse)) on r1.movieid = r2.movieid r1.userid <= r2.userid group r1.userid, r2.userid 
Comments
Post a Comment