I'm playing around with hospital waiting lists trying to find out what factors affect waiting times.
Taking the Pearson correlation of my features (the ethnic make-up of each hospital's waiting list and the time spent waiting), the data looks like this:
Pearson Correlation Heatmap: Ethnicity and waiting time
Pearson Correlation Heatmap: normalized rows
Well that was silly. Of course there will be a negative correlation between ethnicities as the total needs to sum to 1.0.
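A quick way to see why, as a rough sketch on synthetic numbers rather than the hospital data: draw independent positive weights for five made-up groups, normalise each row so it sums to 1.0, and compare the Pearson correlation before and after. The helper names (`pearson`, `random_weights`) and the choice of five groups and 2000 rows are assumptions for illustration only.

```python
import math
import random

random.seed(0)

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def random_weights(k):
    """k independent positive weights (hypothetical, not real data)."""
    return [random.gammavariate(1.0, 1.0) for _ in range(k)]

# 2000 synthetic "hospitals", each with 5 made-up group weights.
raw = [random_weights(5) for _ in range(2000)]

# Normalise every row so its shares sum to 1.0, like the ethnicity columns.
shares = [[w / sum(row) for w in row] for row in raw]

r_raw = pearson([row[0] for row in raw], [row[1] for row in raw])
r_shares = pearson([row[0] for row in shares], [row[1] for row in shares])

print(f"independent weights:  r = {r_raw:.2f}")
print(f"normalised to shares: r = {r_shares:.2f}")
```

The raw weights are independent by construction, so any clearly negative correlation in the second line comes purely from forcing each row to sum to 1.0.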
Pearson correlation: ethnicities vs waiting list time
coefficient          category   p-value
2.9759572500367764   White      0.011753367452440378
0.607824511239789    Black      0.6505335882518613
1.521480345096513    Other      0.2828384545133975
1.2063242152220122   Mixed      0.4440061099633672
14.909193776961727   intercept  0.0
Now this is more interesting. The p-values are still a little too high to draw conclusions about two of the groups, but it's starting to look like the waiting list size is lower if you are of Asian or Mixed ethnicity.
Adding other columns makes the ethnicity results, at least, look even more certain, although the p-values for these new categories (including socioeconomic and age features) were themselves not particularly conclusive.
coefficient          category   p-value
-4.309164982942572   Mixed      0.0004849926653480718
-2.3795206765105696  Black      0.027868866311572482
-2.2066035364090726  Other      0.008510096871574113
-2.9196528536463315  Asian      2.83433623644580e-05
20.342211763968347   intercept  0.0
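Each table lists an intercept plus four of the five ethnicity categories, which is what a least-squares fit looks like when one category is left out as the baseline. The post does not say how the numbers were produced, so treat the following as a sketch under that assumption, on synthetic data with made-up effects. It shows why one share has to be dropped when the shares sum to 1.0: with an intercept and all five shares the design matrix loses a rank, and each remaining coefficient is read relative to the omitted group.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hospitals = 200
groups = ["White", "Black", "Other", "Mixed", "Asian"]

# Synthetic shares: each row sums to 1.0 (NOT the post's data).
shares = rng.dirichlet(np.ones(len(groups)), size=n_hospitals)

# Made-up "true" model: a baseline wait of 15, shifted by each
# non-baseline share (order: Black, Other, Mixed, Asian).
true_effects = np.array([2.0, -1.0, 3.0, -2.0])
noise = rng.normal(0.0, 0.1, n_hospitals)
waiting = 15.0 + shares[:, 1:] @ true_effects + noise

intercept = np.ones((n_hospitals, 1))
X_all = np.hstack([intercept, shares])         # intercept + all 5 shares
X_ref = np.hstack([intercept, shares[:, 1:]])  # "White" omitted as baseline

# The shares add up to the intercept column, so X_all is rank deficient.
rank_all = np.linalg.matrix_rank(X_all)
rank_ref = np.linalg.matrix_rank(X_ref)
print("columns / rank, all groups: ", X_all.shape[1], rank_all)
print("columns / rank, one omitted:", X_ref.shape[1], rank_ref)

# Ordinary least squares on the full-rank design.
coef, *_ = np.linalg.lstsq(X_ref, waiting, rcond=None)
for name, value in zip(["intercept"] + groups[1:], coef):
    print(f"{name:>9}: {value:.2f}")
```

Which group gets dropped changes the intercept and the sign and size of every other coefficient, so two tables with different baselines are not directly comparable row by row.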