stata - Create 3-way percentages table -


I want to have a 3-way table displaying column or row percentages using 3 categorical variables. The command below gives the counts, but I cannot find how to get percentages instead.

    sysuse nlsw88
    table married race collgrad, col

    --------------------------------------------------------------------
              |                college graduate and race
              | ---- not college grad ----    ------ college grad ------
      married | white  black  other  total    white  black  other  total
    ----------+---------------------------------------------------------
       single |   355    256      5    616      132     53      3    188
      married |   862    224     12  1,098      288     50      6    344
    --------------------------------------------------------------------

How can I get percentages?

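This answer will show a miscellany of tricks. The downside is that I don't know an easy way to do exactly what you ask. The upside is that these tricks are all easy to understand and often useful.

Let's use your example, which is excellent for the purpose.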

    . sysuse nlsw88, clear
    (NLSW, 1988 extract)

Tip #1 You can calculate a percent variable yourself. I focus on % single. In this data set married is binary, so I won't show the complementary percent. Once you have calculated it, you can (a) rely on the fact that it is constant within the groups used to define it and (b) tabulate it directly. I find that tabdisp is underrated by users. It's billed as a programmer's command, but it is not difficult to use at all. tabdisp lets you set a display format on the fly; it does no harm, and might be useful for other commands, to assign one directly using format.

    . egen pcsingle = mean(100 * (1 - married)), by(collgrad race)

    . tabdisp collgrad race, c(pcsingle) format(%2.1f)

    --------------------------------------
                     |        race
    college graduate | white  black  other
    -----------------+--------------------
    not college grad |  29.2   53.3   29.4
        college grad |  31.4   51.5   33.3
    --------------------------------------

    . format pcsingle %2.1f
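The same idea can be pushed towards the 3-way layout asked for in the question. What follows is only a minimal sketch, not a built-in route: the variable names n_cell, n_col and pc_col are made up for illustration, and the code reloads the data, so run it separately from the session above. It computes, for each combination of collgrad and race, the percent of women in each category of married, and then hands the result to tabdisp with collgrad as the supercolumn variable.

```stata
* Sketch: column percents for married within each collgrad-by-race column
sysuse nlsw88, clear

* number of observations in each married x race x collgrad cell
bysort collgrad race married: gen n_cell = _N

* number of observations in each race x collgrad column (the denominator)
bysort collgrad race: gen n_col = _N

* column percent; constant within each cell, as tabdisp requires
gen pc_col = 100 * n_cell / n_col

* same row, column and supercolumn layout as: table married race collgrad
tabdisp married race collgrad, c(pc_col) format(%4.1f)
```

For row percents the only change should be the denominator: count observations by collgrad and married rather than by collgrad and race.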

Tip #2 The user-written command groups offers different flexibility. groups can be installed from SSC (strictly, it must be installed before you can use it). It's a wrapper for various kinds of tables, using list as its display engine.

    . * installation once
    . ssc inst groups

    . groups collgrad race pcsingle

      +-------------------------------------------------------+
      |         collgrad    race   pcsingle   freq.   percent |
      |-------------------------------------------------------|
      | not college grad   white       29.2    1217     54.19 |
      | not college grad   black       53.3     480     21.37 |
      | not college grad   other       29.4      17      0.76 |
      |     college grad   white       31.4     420     18.70 |
      |     college grad   black       51.5     103      4.59 |
      |-------------------------------------------------------|
      |     college grad   other       33.3       9      0.40 |
      +-------------------------------------------------------+

We can improve on that. We can set better header text using characteristics. (In practice, these can be less constrained than variable names, but they often need to be shorter than variable labels.) We can also add separators by calling up standard list options.

    . char pcsingle[varname] "% single"

    . char collgrad[varname] "college?"

    . groups collgrad race pcsingle , subvarname sepby(collgrad)

      +-------------------------------------------------------+
      |         college?    race   % single   freq.   percent |
      |-------------------------------------------------------|
      | not college grad   white       29.2    1217     54.19 |
      | not college grad   black       53.3     480     21.37 |
      | not college grad   other       29.4      17      0.76 |
      |-------------------------------------------------------|
      |     college grad   white       31.4     420     18.70 |
      |     college grad   black       51.5     103      4.59 |
      |     college grad   other       33.3       9      0.40 |
      +-------------------------------------------------------+

Tip #3 Wire display formats into a variable by making a string equivalent. I don't illustrate this fully, but I often use it when I want to combine display of counts with numerical results with decimal places in tabdisp. format(%2.1f) and format(%3.2f) might do fine for most variables (and incidentally the important detail is the number of decimal places), but they would lead to a count of 42 being displayed as 42.0 or 42.00, which looks pretty silly. The format() option of tabdisp does not reach inside a string and change its contents; it doesn't even know what the string variable contains or where it came from. So, strings are just shown by tabdisp as they come, which is what you want.

    . gen s_pcsingle = string(pcsingle, "%2.1f")

    . char s_pcsingle[varname] "% single"
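To make the string trick concrete, here is a hedged sketch of showing counts and percents side by side in tabdisp. The variable names n_cell and s_n are invented for this illustration, and the code reloads the data, so run it apart from the session above. Because both cell variables are strings, no format() option is needed, and the count keeps its integer look while the percent keeps one decimal place.

```stata
* Sketch: counts and % single in the same tabdisp table, both as strings
sysuse nlsw88, clear

* percent single within each collgrad x race group
egen pcsingle = mean(100 * (1 - married)), by(collgrad race)

* group size, constant within each collgrad x race group
bysort collgrad race: gen n_cell = _N

* string versions: the count with no decimals, the percent with one
gen s_n = string(n_cell)
gen s_pcsingle = string(pcsingle, "%2.1f")

* tabdisp shows string cell variables exactly as they come
tabdisp collgrad race, c(s_n s_pcsingle)
```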

groups has an option to save what is tabulated as a fresh dataset.

Tip #4 To have a total category, temporarily double up the data. The clone of the original is relabelled as the total category. You may need to do some extra calculations, but nothing there amounts to rocket science: a smart high school student could figure it out. Here a concrete example for line-by-line study beats lengthy explanations.

    . preserve

    . local np1 = _N + 1

    . expand 2
    (2,246 observations created)

    . replace race = 4 in `np1'/l
    (2,246 real changes made)

    . label def racelbl 4 "total", modify

    . drop pcsingle

    . egen pcsingle = mean(100 * (1 - married)), by(collgrad race)

    . char pcsingle[varname] "% single"

    . format pcsingle %2.1f

    . gen istotal = race == 4

    . bysort collgrad istotal: gen total = _N

    . * for percents of the global total, we need to correct for doubling up
    . scalar alltotal = _N/2

    . * the table shows percents for college & race | collgrad and for collgrad | total
    . bysort collgrad race : gen pc = 100 * cond(istotal, total/alltotal, _N/total)

    . format pc %2.1f

    . char pc[varname] "percent"

    . groups collgrad race pcsingle pc , show(f) subvarname sepby(collgrad istotal)

      +-------------------------------------------------------+
      |         college?    race   % single   percent   freq. |
      |-------------------------------------------------------|
      | not college grad   white       29.2      71.0    1217 |
      | not college grad   black       53.3      28.0     480 |
      | not college grad   other       29.4       1.0      17 |
      |-------------------------------------------------------|
      | not college grad   total       35.9      76.3    1714 |
      |-------------------------------------------------------|
      |     college grad   white       31.4      78.9     420 |
      |     college grad   black       51.5      19.4     103 |
      |     college grad   other       33.3       1.7       9 |
      |-------------------------------------------------------|
      |     college grad   total       35.3      23.7     532 |
      +-------------------------------------------------------+

Note the extra trick of using a variable that is not shown explicitly (here istotal) to add separator lines.

