I'm playing around with hospital waiting lists trying to find out what factors affect waiting times.
Plotting the Pearson correlation between my features (the ethnic make-up of each hospital's waiting list and the time spent waiting) gives this:
[Figure: Pearson Correlation Heatmap: Ethnicity and waiting time]
[Figure: Pearson Correlation Heatmap: normalized rows]
Well that was silly. Of course there will be a negative correlation between ethnicities as the total needs to sum to 1.0.
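Here is a minimal sketch of why this happens, using made-up data rather than the real waiting lists. The Dirichlet draw, the sample size and the variable names are all assumptions for illustration; the only thing it shares with the real data is that each row sums to 1.0.

```python
import numpy as np

rng = np.random.default_rng(0)
categories = ["White", "Black", "Asian", "Mixed", "Other"]

# Hypothetical data: each row is one hospital's ethnic make-up, summing to 1.0
shares = rng.dirichlet(alpha=np.ones(len(categories)), size=1000)

# Pearson correlation between the columns (one column per category)
corr = np.corrcoef(shares, rowvar=False)

# Every off-diagonal entry: the correlation between two different categories
off_diagonal = corr[~np.eye(len(categories), dtype=bool)]

print(np.round(corr, 2))
print("all pairwise correlations negative:", bool((off_diagonal < 0).all()))
```

The columns here are generated with no real relationship between them beyond the sum-to-one constraint, yet every pair still comes out negatively correlated, because a larger share for one category leaves less for the others.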
[Figure: Pearson correlation: ethnicities vs waiting list time]
coefficient           category    p-value
2.9759572500367764    White       0.011753367452440378
0.607824511239789     Black       0.6505335882518613
1.521480345096513     Other       0.2828384545133975
1.2063242152220122    Mixed       0.4440061099633672
14.909193776961727    intercept   0.0
Now this is more interesting. The p-values are still a little too high to draw conclusions about two groups, but it's starting to look like the waiting time is lower if you are of Asian or Mixed ethnicity.
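The table above has no Asian row, which suggests one category was left out of the fit as a baseline. That is an assumption on my part about how a table like this is typically produced, not a record of the exact code. The sketch below uses made-up shares and a made-up relationship (`true_effect` and the noise level are invented) to show why a category has to be dropped and how coefficients and p-values of this shape can be computed with NumPy and SciPy.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
categories = ["White", "Black", "Asian", "Mixed", "Other"]
n = 500

# Hypothetical data: ethnic shares per hospital (rows sum to 1.0)
shares = rng.dirichlet(np.ones(len(categories)), size=n)

# Made-up relationship, for illustration only
true_effect = np.array([3.0, 0.5, 0.0, 1.0, 1.5])
waiting_time = 15.0 + shares @ true_effect + rng.normal(0.0, 1.0, size=n)

# An intercept plus all five shares is rank-deficient: the shares sum to 1.0,
# which duplicates the intercept column
full_design = np.column_stack([np.ones(n), shares])
full_rank = np.linalg.matrix_rank(full_design)  # 5 columns' worth, not 6

# Drop one category (Asian here) to act as the baseline
baseline = "Asian"
kept = [c for c in categories if c != baseline]
kept_idx = [categories.index(c) for c in kept]
X = np.column_stack([shares[:, kept_idx], np.ones(n)])

# Ordinary least squares fit
beta, _, rank, _ = np.linalg.lstsq(X, waiting_time, rcond=None)

# Two-sided p-values from the usual t-statistics
residuals = waiting_time - X @ beta
dof = n - X.shape[1]
sigma2 = residuals @ residuals / dof
std_err = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
p_values = 2 * stats.t.sf(np.abs(beta / std_err), dof)

for name, b, p in zip(kept + ["intercept"], beta, p_values):
    print(f"{name:<10} {b:8.3f} {p:10.4f}")
```

Under this setup each coefficient is read relative to the dropped category: it estimates how the waiting time changes as share moves from the baseline group to that group. Positive coefficients for every remaining group would therefore point to the baseline group having the lowest waiting time.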
Adding other columns makes the ethnicity coefficients, at least, look even more certain, although the p-values for the new columns themselves (which include socioeconomic and age categories) were not particularly conclusive.
coefficient            category    p-value
-4.309164982942572     Mixed       0.0004849926653480718
-2.3795206765105696    Black       0.027868866311572482
-2.2066035364090726    Other       0.008510096871574113
-2.9196528536463315    Asian       2.83433623644580e-05
20.342211763968347     intercept   0.0
 
 