Attendance and Inclusion Services and Interventions

Author

Giles Robinson

Published

July 9, 2024


1 Introduction

A programme of analytical work was begun by the SCC BI team in September 2023, with the aims of understanding the key drivers of school absences in Sheffield and the reach and effectiveness of existing services and interventions. The first of those two requirements is written up in the report Attendance in Sheffield Schools. This report addresses the second requirement.

A note on terminology

The terms services and interventions are generally used interchangeably. Some of the things we’re evaluating here might be better classed as events. Even so, in each case we’re treating them the same, using the data to evaluate their impact on attendance.

1.1 Data sources & processing

This is only a brief overview of the data sources & processing used in this report. Please contact the BI team (giles.robinson@sheffield.gov.uk) if you require further detail.

Attendance data is held in the PAS Oscar database tables
Involvements are held in PAS Oscar database
Child social care episodes are retrieved from the Oscar database but are recorded on the LiquidLogic children’s system.
Other services and interventions which are not stored as involvements (school attendance orders, parenting programmes, managed moves) are also retrieved from the Oscar schemas.
Economic deprivation information is retrieved from local indicators data
Demographics are from the Oscar database
Special Educational Needs data is from the school census
School information is from the Oscar database

A separate data model R script retrieves the data listed above and transforms it into clean processed data for analysis in this report script. Crucially for this analysis, attendance data is categorised by DfE attendance codes and calculated as a % of available sessions attended. These metrics are then mapped onto the involvements data according to involvement start dates, and average ‘before and after’ attendance metrics are calculated. This creates an involvement summary which can be generated for any demographic group for which we have data.
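The mapping step described above can be illustrated with a minimal sketch. It is written in Python for illustration only (the actual data model is an R script), and the function names, the row layout and the use of integer half term indexes are all assumptions rather than the report's implementation.

```python
from collections import defaultdict
from statistics import mean


def attendance_pct(sessions_attended, sessions_available):
    """Percentage of available sessions attended; None if none were available."""
    if sessions_available == 0:
        return None
    return 100.0 * sessions_attended / sessions_available


def before_after_summary(attendance_rows, involvement_start_index):
    """Average attendance % per half term, indexed relative to involvement start.

    attendance_rows: iterable of (child_id, half_term_index, attended, available)
    involvement_start_index: dict of child_id -> half term index of involvement start
    (both layouts are hypothetical). Negative keys are periods before the start.
    """
    by_period = defaultdict(list)
    for child_id, half_term, attended, available in attendance_rows:
        if child_id not in involvement_start_index:
            continue  # child has no involvement of this type
        pct = attendance_pct(attended, available)
        if pct is None:
            continue  # no sessions available in this half term
        relative = half_term - involvement_start_index[child_id]
        by_period[relative].append(pct)
    return {rel: mean(vals) for rel, vals in sorted(by_period.items())}


# Illustrative data only: two children whose involvements start in different half terms.
rows = [
    ("a", 1, 50, 60), ("a", 2, 40, 60), ("a", 3, 54, 60),
    ("b", 2, 30, 60), ("b", 3, 45, 60), ("b", 4, 48, 60),
]
starts = {"a": 2, "b": 3}
summary = before_after_summary(rows, starts)
```

Indexing each child's half terms relative to their own start date is what allows children with different start dates to be averaged on a single ‘before and after’ axis.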

1.2 Analytical approach

Measures of effectiveness & finding suitable comparators

In this report our aim is to evaluate how effective different services and interventions are at increasing school attendance and reducing exclusion rates - though the primary focus is attendance. There are a few difficulties with this:

Services are allocated according to need, and not on a random basis, so we often do not have a “control group”
Attendance tends to reduce over time as children progress through secondary school - so our background rate is (in older children at least) always reducing, and we shouldn’t necessarily be surprised to see that our interventions are associated with a reduction in attendance, even if the stated aim is to increase attendance

To tackle this we’ll take a few different approaches, explained here with example plots:

Average attendance before & after

The ideal approach is to compare a baseline period (prior to intervention) to a test period (following intervention) for the same children. We can then see differences in overall attendance and coded absence reasons, and we can see how these differ for different groups.

This shows the overall trajectory of changing attendance, and how this is affected by our involvements - though it’s important to remember that the points on these charts are just the overall average, behind which lie a lot of variation.

For example, this is the plot for Attendance Advice. We see attendance declining prior to involvement and improving afterward.

Important

The pattern seen here is repeated for many of the interventions covered in this report. The overall trend here changes direction, but attendance levels do not return to what they were prior to involvement. Moreover, if we add up all the attendance on the left hand side of the chart above and compare it to that on the right, we see a net negative change. Attendance advice is associated with a decrease in attendance levels.
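The net comparison described above can be sketched as follows. This is an illustrative Python sketch, not the report's code: the function name is hypothetical, the trajectory values are invented, and it compares means rather than raw sums so that an unequal number of periods on each side does not bias the result.

```python
def net_change(trajectory):
    """Mean attendance after involvement start minus mean attendance before it.

    trajectory: dict of relative half term index -> average attendance %.
    Negative indexes are treated as 'before', zero and above as 'after'
    (an assumption about where the start period belongs).
    """
    before = [value for rel, value in trajectory.items() if rel < 0]
    after = [value for rel, value in trajectory.items() if rel >= 0]
    if not before or not after:
        raise ValueError("need at least one period on each side of the start")
    return sum(after) / len(after) - sum(before) / len(before)


# Invented values: attendance falls before the start, then recovers only partially.
example = {-4: 92.0, -3: 90.0, -2: 86.0, -1: 80.0, 1: 82.0, 2: 84.0, 3: 85.0, 4: 86.0}
change = net_change(example)  # negative, despite the upward trend after the start
```

In this invented example the trend reverses after the start, yet the net change is still negative because the later periods never regain the earlier levels.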

Categorised attendance before & after

This view looks at the same indexed time periods as the average attendance before & after plot shown above, but here children are categorised according to attendance brackets. For each time period (again, relative to the involvement start or closure date) the % of children in each attendance category is calculated:

0 - 50%
50 - 80%
80 - 90%
90 - 100%

Also included here are several important other categories:

“pre-school” - children who have not yet reached school age at the time period in question
“leaving age” - those who have completed Y11 and left school at the time period in question
“COVID” - half term periods that fell during COVID lockdowns, the attendance data for which is incomplete or unreliable
“out of scope” - time periods that have not elapsed yet or for which data is not yet available. These are categorised but excluded from the % calculations used in the plots.
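The bracket calculation can be sketched as below. This is an illustrative Python sketch with hypothetical names. Two details are assumptions, since the report does not state them: which side of each boundary is inclusive (here the lower bound), and the default set of categories left out of the % calculation.

```python
from collections import Counter

# Assumed default: the categories typically removed before calculating shares.
DEFAULT_EXCLUDED = frozenset({"pre-school", "leaving age", "COVID", "out of scope"})


def attendance_bracket(pct):
    """Map an attendance % onto one of the four brackets (lower bound inclusive)."""
    if pct < 50:
        return "0 - 50%"
    if pct < 80:
        return "50 - 80%"
    if pct < 90:
        return "80 - 90%"
    return "90 - 100%"


def bracket_shares(categories, excluded=DEFAULT_EXCLUDED):
    """Percentage of children in each category for one time period.

    categories: one category label per child for the period in question.
    Labels in `excluded` are dropped before the shares are calculated.
    """
    counts = Counter(c for c in categories if c not in excluded)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


# Illustrative period: three children with attendance data, one COVID period.
period = [attendance_bracket(42.0), attendance_bracket(93.5),
          attendance_bracket(97.0), "COVID"]
shares = bracket_shares(period)
```

Passing `excluded=frozenset()` keeps every category in the denominator, which matches plots where the additional categories are shown.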

This can reveal patterns of change in the distribution of attendance that lies behind the averages. Here’s an example categorised stacked bar plot, in this case for Attendance Advice:

Coded absence reasons

Comparing levels of different coded absence reasons before & after involvement may reveal changes in patterns of behaviour. We can do this at the level of the year prior and year post involvement, or at the level of the half term prior and post involvement. When looking at the year prior & post we restrict the data to only children who have at least 3 half terms’ worth of data available in each of those one year periods.
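The restriction to children with enough data on both sides can be sketched like this. It is an illustrative Python sketch with hypothetical names, and it assumes six half terms per school year and that half terms are indexed relative to the involvement start (negative before, zero and above after).

```python
def eligible_children(half_terms_by_child, min_half_terms=3, window=6):
    """Children with enough data in both the year before and the year after.

    half_terms_by_child: dict of child_id -> set of relative half term indexes
    for which attendance data exists (hypothetical layout).
    window: number of half terms treated as one year (assumed to be 6).
    """
    eligible = set()
    for child_id, periods in half_terms_by_child.items():
        prior = sum(1 for p in periods if -window <= p < 0)
        post = sum(1 for p in periods if 0 <= p < window)
        if prior >= min_half_terms and post >= min_half_terms:
            eligible.add(child_id)
    return eligible


# Illustrative data: only child "a" has 3+ half terms on each side within a year.
coverage = {
    "a": {-3, -2, -1, 0, 1, 2},
    "b": {-1, 0, 1, 2},          # too little data before the start
    "c": {-9, -8, -7, 0, 1, 2},  # prior data falls outside the one year window
}
kept = eligible_children(coverage)
```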

As an example, here is the coded absence before & after plot for the ‘Current EHCP’ involvement:

With & without analysis

Comparing the attendance levels (and coded absence reasons) of groups who had the intervention with those who did not. This is more problematic because finding suitable comparator groups is difficult, but useful for interventions where no prior period data is available - such as the readiness for school team, or 0-5 SEND team.
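A ‘with & without’ comparison of this kind can be sketched as below. This is an illustrative Python sketch: the record layout and function name are hypothetical, and it makes no attempt to solve the comparator problem noted above, it only shows the grouping.

```python
from collections import defaultdict
from statistics import mean


def with_without_by_year(records, involved_ids):
    """Mean attendance % per national curriculum year, split by involvement.

    records: iterable of (child_id, nc_year, attendance_pct) (hypothetical layout)
    involved_ids: set of child ids who had the involvement at any point.
    Returns a dict keyed by (group, nc_year), where group is 'with' or 'without'.
    """
    grouped = defaultdict(list)
    for child_id, nc_year, pct in records:
        group = "with" if child_id in involved_ids else "without"
        grouped[(group, nc_year)].append(pct)
    return {key: mean(values) for key, values in grouped.items()}


# Invented records for three children across two school years.
records = [
    ("a", 1, 90.0), ("a", 2, 86.0),
    ("b", 1, 96.0), ("b", 2, 95.0),
    ("c", 1, 94.0), ("c", 2, 93.0),
]
comparison = with_without_by_year(records, involved_ids={"a"})
```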

Here’s an example ‘with & without’ plot, in this case plotting attendance from year 1 through to year 6 for those with and without the Inclusion & Attendance Y4 team involvement:

1.3 Random sampling

To contextualise the above, we can compare the outputs of the analysis described above with a similar analysis for randomly selected children with no involvement. Before we get into the analysis of the actual involvements, it’s worth looking at a couple of these.

Firstly, a random sample average trajectory plot. Here we’ve taken 200 random children who consistently hit below 75% (for 3 or more half terms in a row), since 2021, and who did not have any involvements.
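The selection of this random sample can be sketched as follows. This is an illustrative Python sketch with hypothetical names. It assumes each child's attendance percentages are in chronological order with no gaps, and that the series has already been restricted to the period since 2021.

```python
import random


def has_low_run(pcts, threshold=75.0, run_length=3):
    """True if attendance is below `threshold` for `run_length` half terms in a row."""
    run = 0
    for pct in pcts:
        run = run + 1 if pct < threshold else 0
        if run >= run_length:
            return True
    return False


def sample_comparison_children(attendance_by_child, involved_ids, n=200, seed=None):
    """Random sample of children with a low attendance run and no involvements.

    attendance_by_child: dict of child_id -> list of half term attendance %.
    involved_ids: set of child ids with any involvement (these are excluded).
    """
    candidates = sorted(
        child_id
        for child_id, pcts in attendance_by_child.items()
        if child_id not in involved_ids and has_low_run(pcts)
    )
    rng = random.Random(seed)
    return rng.sample(candidates, min(n, len(candidates)))


# Invented data: "a" qualifies, "b" has an involvement, "c" attends well.
attendance = {"a": [70.0, 60.0, 74.0], "b": [70.0, 70.0, 70.0], "c": [90.0, 92.0, 95.0]}
sample = sample_comparison_children(attendance, involved_ids={"b"}, seed=1)
```

Sorting the candidates before sampling means that a fixed `seed` gives a reproducible sample.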

These random samples often show better apparent recovery than many of the involvements. The best explanation for this is regression to the mean, the idea that when conditions are in a more unfavourable state, they will often revert back to a less unfavourable state, even in the absence of any intervention. This can lead, for example, to people concluding that the medicine that they began taking when they were feeling particularly ill was the cause of their recovery, when in fact they would have recovered naturally without the medicine.
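Regression to the mean can be demonstrated with a small simulation. This is an illustrative Python sketch using invented parameters, not Sheffield data: each simulated child has a stable underlying attendance level plus random period-to-period noise, and no intervention is applied at any point.

```python
import random
from statistics import mean


def simulate_regression_to_mean(n_children=5000, threshold=80.0, seed=1):
    """Select children who look poor in period 1 and re-observe them in period 2.

    Returns (mean period 1 attendance, mean period 2 attendance, number selected)
    for children whose observed period 1 attendance fell below `threshold`.
    All distribution parameters are invented for illustration.
    """
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_children):
        level = rng.gauss(90.0, 5.0)  # the child's stable underlying attendance
        obs1 = min(100.0, max(0.0, level + rng.gauss(0.0, 8.0)))
        obs2 = min(100.0, max(0.0, level + rng.gauss(0.0, 8.0)))
        if obs1 < threshold:  # selected because period 1 looked poor
            first.append(obs1)
            second.append(obs2)
    return mean(first), mean(second), len(first)


before, after, n_selected = simulate_regression_to_mean()
```

Because children are selected on an unusually low observation, part of that low value is noise, so their next observation tends to be higher even though nothing was done. Any ‘before and after’ improvement for an involvement needs to be read against this background.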

Next we’ll look at the categorised attendance of a random sample, before & after a random date. When evaluating involvements we will typically remove the pre-school, COVID, leaving age and out of scope categories, but they’re included here. The profile of these plots is usually very different for children with involvements.

Finally we’ll look at example coded reasons for a random sample. These four plots each show the year on year change for a random sample of 200 children, based on a randomly chosen date since 2021.

The takeaway from these 4 charts is that a year on year increase in ‘no reason’ absences of about 1% is to be expected in many cases, as are small variations in the other coded reasons. Large changes beyond this are likely to reflect the behaviour of the cohort receiving the involvement and/or a result of the involvement itself.

2 Services & Interventions

The rest of this report details the findings on effectiveness, broadly following the methodology given above. We cover the following services, interventions & events, and aside from the child social care episodes (CIN/CPP/CLA) which are taken together, we’ll cover in order of decreasing volumes of activity (based on a count of children receiving each during 2024).

Involvement	count
CIN	2971
MAST	2900
Ed. Psych.	922
Attendance Advice	843
Reduced timetable	788
Current EHCP	758
Autism team	648
Inclusion Advice	541
EHE	540
EY Inclusion	537
Exclusion Concern	481
PA Cohort Tracking	406
Consultation	401
Progressions Team	370
CPP	306
penalty notice warning letter	287
SA Cohort Tracking	221
CLA	220
Parenting	218
Other	154
I&A - Y9	112
HI team	110
I&A - Complex SEND	109
Non PIP/SIP - Think for the Future	107
Non PIP/SIP - Theraputic Outreach	101
SIP - Theraputic Outreach	77
Secondary Inclusion Panel	77
I&A - Vulnerable Learner	60
PIP - Theraputic Outreach	59
Primary Inclusion Panel	59
Portage	58
Rowan outreach	53
VI team	53
UCAN	42
I&A - School Readiness	35
I&A - Y4	32
S437- School Att Order EHE	32
PNOR 50+	31
S437 - School Att Order CME	23
School Att Order Breach	23
I&A - Reintegration	13
GP Protocol	10
Nurture - BLT KS1	9
Nurture - BLT KS3	7
Nurture - Bumblebee	6
Nurture - BLT KS2	5
Nurture - Step Out	5
Exclusions - Managed Move Involvement	4
Managed Move	4
SIP Screening	4
I&A - Traveller Education	3
PIP - Think for the Future	3
Nurture - The Hive	2
EHE Meeting	1
Nurture -	1
PIP Screening	1

2.1 Family Intervention Service

Family Intervention Service (FIS), formerly known as MAST.

This service is targeted at families rather than children, but in our data model we mapped FIS activity to the records of about 3000 children per year:

Immediately prior to FIS involvement we see a large drop in average attendance - perhaps this reflects some family crisis that prompts the service’s involvement, and also affects school attendance? The involvement is associated with an immediate increase in attendance, and a change in the direction of travel, with attendance steadily improving term-on-term once the team is involved. However, the overall net year on year change is still negative - attendance does not recover to prior levels.

Exclusion rates remain high for children involved with FIS:

The categorised reasons also show the immediate effect of FIS involvement, with an increase in the highest attendance category, and a drop in the lowest bracket:

Since the significant change following FIS involvement happens at the half term level, we’ll look at coded reasons at the half term level. Here we see a decrease across all coded absences, especially in illness rates.

Finally, we look at how the half-term to half-term shifts in attendance vary by some chosen characteristics:

Younger children see a greater net improvement than older children; those going into Y7 and Y11 see a net reduction
Girls see a bigger improvement than boys
Deprivation makes a difference with children in more deprived wards seeing a bigger increase in attendance - but only to a point

2.2 Educational Psychology

This involvement marks the child’s placement on the caseload of an Educational Psychologist. There are around a thousand of these in the city each year:

The majority of these are for children who require SEN support, and those with speech, language and communication needs:

Educational Psychology
counts of children with involvements in 2024, by SEN level
SEN Support	475
No SEN	296
NA	102
EHCP	49

Educational Psychology
counts of children with involvements in 2024, by primary specific need
NA	407
Speech, Language And Communication Needs	260
Autistic Spectrum Disorder	89
Social, Emotional and Mental Health	85
Moderate Learning Difficulty	21
Physical Disability	11
Other	10
Hearing Impairment	9
Specific Learning Difficulty	7
No Specialist Assessment	6
Visual Impairment	6
Profound And Multiple Learning Difficulty	4
Severe Learning Difficulty	4
Behaviour, Emotional And Social Difficulty	2
Multi Sensory Impairment	1

The two year run up to the start of an Educational Psychology involvement shows quite a steep decline in average attendance, and the involvement start is associated with a turnaround in the direction of travel.

Exclusion rates show a rapid and sustained drop following Educational Psychology involvement:

The coded reasons analysis for the year before & after Educational Psychology involvement doesn’t reveal any movement beyond what we see in the random sampling, so is not included here. The categorised attendance analysis also doesn’t reveal any significant impacts of Educational Psychologist involvement.

Finally, we also looked at a ‘with & without’ analysis, tracking children with various SEN levels and needs, and children in deprived areas, but across all of these variables children with the Ed. Psych. involvement consistently have lower attendance than those without it - and clearly there are factors at work beyond what we have access to in the data.

2.3 Attendance advice

844 attendance advice involvements were started in 2024:

The age profile shows children of all ages receiving this involvement, but with a peak at ages 13 - 15.

Calculating the average (mean) attendance in the terms prior to, and following, the opening of an attendance advice involvement, we see the pattern below. Average attendance declines steeply towards the point of intervention. After the involvement starts we see a turnaround, though attendance does not recover to the levels seen two years prior (given the overall 4 year timescale here, this can perhaps be attributed to the tendency of attendance to reduce with age).

The categorised attendance analysis shows more detail than the overall average, as we get steady growth in both those entirely missing ‘NA’ and the highest attendance bracket, while the severe absence category steadily decreases. Here those leaving school or at preschool are removed:

Another approach is to compare the relative effectiveness of attendance advice on groups of children with different characteristics. Attendance advice is associated with a reduction in attendance year on year (and this remains true across all groups) but if we look half-term to half-term there are groups who see a net improvement:

Attendance advice makes more of a difference in younger children.

There is a big difference in gender - with girls seeing a net increase at the half term level, and boys seeing an overall decrease

There is also a trend by deprivation - children living in poorer areas of the city see a net increase whereas children living in more affluent wards see a decrease on average.

It’s worth also looking at the school level - since attendance advice is essentially a notification from the authority to the school, and the school’s response to this may vary. The confidence intervals (shown as error bars here) are very wide for this data, but there are big differences between schools that may be indicative of different responses to the attendance advice intervention:
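School level error bars of this kind could be calculated along the following lines. This is an illustrative Python sketch with hypothetical names. The report does not say how its confidence intervals were calculated, so the normal approximation used here (mean plus or minus 1.96 standard errors) is an assumption, and for schools with few children a t-based interval would be wider.

```python
from math import sqrt
from statistics import mean, stdev


def mean_with_ci(values, z=1.96):
    """Mean with an approximate 95% confidence interval (normal approximation)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate spread")
    centre = mean(values)
    half_width = z * stdev(values) / sqrt(n)
    return centre, centre - half_width, centre + half_width


def school_level_changes(changes_by_school):
    """Mean attendance change and interval per school.

    changes_by_school: dict of school -> list of per-child attendance changes
    (hypothetical layout). Schools with fewer than two children are skipped.
    """
    return {
        school: mean_with_ci(changes)
        for school, changes in changes_by_school.items()
        if len(changes) >= 2
    }


# Invented per-child changes in attendance % for two schools.
by_school = {"School A": [2.0, 4.0, 6.0], "School B": [-3.0]}
intervals = school_level_changes(by_school)
```

Small cohorts per school give large standard errors, which is consistent with the very wide intervals noted above.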

Finally for Attendance Advice, looking at the coded reasons a full year either side of involvement only shows increases across various absence reasons, but there are small improvements at the half term level, mostly in terms of illness and ‘no reason’ absences:

2.4 Reduced timetable

Reduced Timetable volumes

Reduced timetables are associated with a reduction in all absence codes:

The categorised attendance data reveals large increases in the 0-50% attendance bracket, presumably as a result of the reduced timetable arrangement. There are no lasting improvements to the higher attendance brackets:

2.5 Current EHCP

The ‘Current EHCP’ involvement simply tracks children with an Education, Health & Care Plan. And although not an intervention in & of itself, it is a plan to meet the child’s needs, so will be associated with other changes.

There are around 800 of these starting each year - though changes in demand, and the council’s response to that demand, may impact volumes. 2023 seems to have been a peak year.

The current EHCP start date sees a sustained increase in average attendance levels:

The EHCP start date is associated with a reduction in almost all coded absence reasons:

The categorised attendance plot shows a sustained increase in all higher attendance brackets, and a reduction in the severely absent bracket:

2.6 Autism Team

We start around 600 of these involvements per year:

The ‘before & after’ plot only shows a small change in attendance levels. But since the team become involved around age 5 most children do not have much of a ‘before’ period to consider.

In the coded year-on-year view, the Autism team start date is associated with an increase in almost all types of absence.

The assumption here has to be that these year on year increases reflect the steeper line seen in the ‘with & without’ analysis below, and are explained by the same greater severity of need.

To create a “with vs without the team” analysis, we take all children in the attendance data, with a primary specific need of Autism at some point in the attendance data (so we’re taking a “lifetime diagnosis” approach). This cohort is then categorised into those who had involvement from the team and those who did not. We see a consistently lower attendance for children involved with the team, as well as a consistently steeper drop in attendance.

Once again, we have to assume that this reflects differences in the severity of need - with the team becoming involved in the more severe cases - and that there are factors at work here that are not recorded in the data.

But before we move on it’s worth considering the age at involvement start date. The autism team generally become involved around age 5, but this can happen at any age - and the team can become involved before a formal diagnosis is in place:

If we categorise children according to their age on first involvement date, there are some interesting differences in patterns of attendance. Here we’ve created three groups: those aged 5 and under when the team first become involved, those aged 6 to 10, and those aged 11+.
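The grouping by age at first involvement can be sketched as below. This is an illustrative Python sketch with hypothetical names. It assumes age is counted in completed years on the earliest involvement start date.

```python
from datetime import date


def age_on(dob, on_date):
    """Age in completed years on a given date."""
    years = on_date.year - dob.year
    if (on_date.month, on_date.day) < (dob.month, dob.day):
        years -= 1  # birthday not yet reached in that year
    return years


def first_involvement_age_group(dob, involvement_dates):
    """Assign a child to one of the three groups by age at first involvement."""
    age = age_on(dob, min(involvement_dates))
    if age <= 5:
        return "5 and under"
    if age <= 10:
        return "6 to 10"
    return "11+"


# Invented example: two involvements, the earliest decides the group.
group = first_involvement_age_group(
    date(2010, 1, 15), [date(2022, 3, 1), date(2021, 2, 1)]
)
```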

2016 seems to have been a peak year for all age groups (which may be a data quality issue) but 2016 aside, the volumes of all 3 of these groups have increased recently:

Plotting attendance by national curriculum year for these three groups shows that children who have involvement with the team earlier in life have better attendance throughout their school career than those whose team involvement starts later. This may result from the actions & support of the team, or perhaps the impact of having a diagnosis at different ages, or it may result from effects upstream of these differences in diagnosis age, none of which we can unpick from the data, but the effect is there:

2.7 Inclusion Advice

Inclusion advice involvements are associated with a reduction in exclusion rates, but exclusion rates persist for these children, remaining far above the average rate (shown here with a dotted line)

Inclusion advice is associated with an increase in “no reason” absences, and absences due to exclusions. This fits with what we see above - exclusion rates are higher year on year for children who receive inclusion advice.

2.8 Elective Home Education (EHE)

Warning

It seems unclear from the data whether the attendance data following an EHE involvement is attendance back at school following a short period of home education, or if it shows attendance in the home education setting. But do home educators submit attendance data?

Because of this uncertainty, the analysis included here is limited, but from the available data it seems a period of EHE results in increased attendance at school.

2.9 Early Years Inclusion

The Early Years inclusion team work almost entirely with children younger than school age, so a ‘before & after’ approach will not work here.

Since the team work with SEN children, it makes sense to compare the attendance levels of children who require SEN support or an EHCP plan, but with and without the EY inclusion team involvement. In both cases, (after year 0 at least) we see better attendance in all years when the team are involved. (note that the ‘with the team’ line here cuts off in year )

2.10 Exclusion Concern

This involvement identifies children with high levels of temporary and permanent exclusions. This is growing over time, presumably as exclusion rates are also growing.

The focus here is exclusions, and looking at the before & after plot for exclusions, we see extremely high exclusion rates prior to involvement, and a significant reduction afterward, but rates continue to be well above average:

Exclusion concern involvements are often correlated with other involvement types - inclusion panel, think for the future and reduced timetables.

2.11 PA Cohort Tracking

This involvement simply means the child is persistently absent, and is being tracked as such. (see also SA cohort tracking, below)

The PA cohort tracking involvement shows no particular improvement in attendance levels:

2.12 Consultation

N.B. At the time of writing this involvement is not understood - we had feedback that it was not in use in late 2023, but there are volumes of activity in 2024 and 2025.

The “Consultation” involvement before & after plot shows very low attendance prior to involvement, and the trend continuing - though we have to consider that activity prior to 2024 is very limited, and so there simply isn’t enough elapsed time to give data here.

2.13 Progressions Team

The progressions team are involved in matching children to alternative provision.

Progressions Team
counts of children with involvements in 2024, by SEN level
No SEN	653
SEN Support	354
EHCP	40
NA	25

There is no improvement to attendance levels associated with progressions team involvement:

The progressions team involvement is associated with an increase in ‘no reason’ absences and absences due to exclusion:

2.14 Penalty notice warning letter

Penalty notice warning letters don’t appear to have any effect on overall attendance (this is in contrast to the Section 437 notices covered later on):

Penalty notice warning letters are associated with year on year increases in most coded absence reasons, particularly ‘no reason’ absences:

2.15 SA Cohort Tracking

This involvement simply means the child is severely absent, and is being tracked as such. (see also PA cohort tracking, above)

Children with the SA cohort tracking involvement show a general improvement in attendance levels:

2.16 Parenting

The parenting data comes via the Early Help data model, and is mapped onto attendance data like the other involvement types. These are children whose parents are enrolled on parenting programmes.

Warning

N.B. the volumes here drop off in 2022, and work is needed to check we’re pulling in all available data.

Parenting programmes don’t appear to have any effect on overall attendance, indeed average levels continue to decline after the involvement start:

Looking at the coded reasons, there are no shifts here that are not comparable with those seen in the random samples - parenting programmes seem to have no significant effects:

2.17 Other

N.B. At the time of writing, I don’t know what this actually means. There is an involvement cohort name of simply “other”, with over 150 involvements starting in 2024:

“Other” type involvements see a substantial change in average attendance levels:

Looking at the coded reasons, there are no significant year on year shifts associated with the ‘Other’ type involvements.

2.18 Hearing Impairment Team

Involvement with the Hearing Impairment team has no apparent effect on attendance - which for this cohort is healthy throughout.

The analysis of coded reasons shows no changes that are not comparable with the background changes seen in the random samples, and so is not included here.

Taking children with a primary specific need of Hearing Impairment, we can do a ‘with & without team involvement’ analysis. This shows consistently lower attendance through all school years, for children with the team involvement. Presumably this reflects differences in severity of need within the cohort of hearing impaired children.

2.19 Inclusion & Attendance Y4 / Inclusion & Attendance Y9

2.19.1 I&A Y4

The Inclusion & Attendance Y4 team get involved from Y4 onwards in order to assist with the transition to secondary school in Y6-7 which is generally associated with a significant drop in attendance. Activity is fairly low in recent years, with only about 40 involvements starting each year.

The team work more with children in more deprived areas, and those who require SEN support.

Inclusion & Attendance Y4 team
counts of children with involvements in 2024, by SEN level
No SEN	304
SEN Support	243
EHCP	15

Inclusion & Attendance Y4 team - counts by IMD quartile
4 = most deprived wards; 1 = least deprived
1	34
2	105
3	157
4	264

And the children who the I&A Y4 team work with tend to have lower attendance throughout primary school

Since the team’s aim is to assist with the transition to secondary school (Y6 to Y7), it makes sense to compare this transition point for children who have involvement with those who do not. We’ll do so across some other factors, like SEN levels, deprivation and prior attendance levels.

The charts below show this analysis from years 4 to 8 for all pupils, those with SEN support, EHCP, those in the most deprived wards of the city, and those who were severely absent in Y4 or in Y6:

The above plots show that attendance throughout years 4 to 7 is consistently lower for all children the team work with. And looking at the transition to secondary school for all pupils on average, for children on SEN support and for children in the most deprived wards, the drop from Y6 to Y7 is more severe among children the team work with. This presumably represents the targeting of this service to children with the highest needs, according to factors not visible in the data.

For children with an EHC plan, and children who were severely absent in Y4 or Y7, when the team are involved attendance actually improves into Y7. The plots below show the net difference in overall % attendance between Y6 and Y7 for the same groups given above, again split by whether the I&A Y4 team were involved:

We should also look at exclusion rates for the Y4 team

2.19.2 I&A - Y9

Overall, children involved with the team show lower attendance through secondary school than those without:

We’ll take the same approach for the Y9 team as we did for Y4 - comparing attendance with & without the team across a number of characteristic groups. This shows a similar pattern to what we saw with the I&A Y4 team - at the Y9 to Y10 boundary, children involved with the team show a more severe drop in attendance than those without, across all the groups we’ve looked at here. The unfortunate conclusion is that we can’t say much about the team’s effectiveness from our data, and we have to assume that there are factors at work that are not present in the data.

Exclusion rates before & after the I&A Y9 team involvement

2.20 I&A - Complex SEND

Note

Although the team is called ‘Complex SEND’ almost half of the children involved with the team do not show up as having any special educational needs, according to the school census data.

Inclusion & Attendance Complex SEND
counts of children with involvements in 2024, by SEN level
No SEN	217
SEN Support	211
EHCP	63

Inclusion & Attendance Complex SEND
counts of children with involvements in 2024, by primary specific need
NA	242
Speech, Language And Communication Needs	80
Autistic Spectrum Disorder	56
Social, Emotional and Mental Health	43
Moderate Learning Difficulty	16
Behaviour, Emotional And Social Difficulty	15
Specific Learning Difficulty	12
Other	9
Profound And Multiple Learning Difficulty	6
Hearing Impairment	4
Severe Learning Difficulty	4
Physical Disability	2
No Specialist Assessment	1
Visual Impairment	1

The I&A complex SEND team involvement shows a sustained improvement in attendance levels:

Although the direction of travel changes for average attendance, when we look at coded absences for the year either side of the involvement start, overall absences increase. Though it’s worth considering that children with complex SEND show a general decline in average attendance beyond the decline for other children. There are also small improvements in lateness, illness & exclusion levels here:

We’ve also looked at a ‘with & without’ analysis, for both children with SEN support and an EHC plan. In both cases, children involved with the team have consistently lower attendance than those without. This must reflect the ‘complex’ nature of the cohort’s needs.

Exclusions

2.21 Non PIP/SIP - Think for the Future

Think for the future is focussed on behaviour, resilience & inclusion. There are around 100 of these involvements each year, for children of all school ages, but mostly around age 11.

The focus here is on exclusions, but for completeness we’ll also show the attendance plot. The children receiving this involvement show very poor attendance, which continues to worsen after the involvement:

This change in average attendance is driven by growth in those severely absent - 0-50% attendance; the orange bars here:

Looking at exclusions, you can see the extremely high exclusion rates immediately prior to involvement here, and although there is some reduction in the average rate, it’s slight and over a long time.

Finally for this involvement, the coded reasons show a significant year on year increase in overall absences, driven mostly by an increase in the ‘no reason’ category:

2.22 Non PIP/SIP - Theraputic Outreach

Tip

The volumes and age profile, categorised attendance profile and before & after plots here all look very similar to the non PIP SIP think for the future involvements given above.

Analysis shows that these are the exact same cohort of pupils! Is this an issue with the data, or to be expected?
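A check of whether two involvement types cover the same pupils can be sketched as below. This is an illustrative Python sketch with hypothetical names. It assumes each cohort is available as a collection of child identifiers.

```python
def compare_cohorts(cohort_a, cohort_b):
    """Compare two cohorts of child ids and report how they overlap."""
    a, b = set(cohort_a), set(cohort_b)
    return {
        "identical": a == b,
        "overlap": len(a & b),
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
    }


# Invented ids: the two involvement types cover exactly the same children.
think_for_the_future = ["p1", "p2", "p3"]
outreach = ["p3", "p1", "p2"]
result = compare_cohorts(think_for_the_future, outreach)
```

If `identical` is true for two involvement types, their ‘before and after’ results are not independent evidence, since they describe the same children.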

There are around 100 involvements starting per year with the age profile peaking at 11:

Attendance levels continue to decline after non PIP/SIP theraputic outreach:

There is also no improvement in rates of exclusion absences for children with this involvement, with rates remaining well above average following involvement start:

2.23 SIP - Theraputic Outreach

SIP theraputic outreach involvements are associated with a continued decline in attendance:

SIP theraputic outreach involvements are associated with a significant improvement in exclusion rates:

2.24 PIP - Theraputic Outreach

After PIP theraputic outreach involvements attendance is very poor - although the picture here is better than at Secondary Inclusion Panel theraputic outreach, and shows a slow improvement:

Though rates do not return to previous levels or the overall average, PIP theraputic outreach involvements are associated with a significant improvement in exclusion rates:

2.25 Secondary Inclusion Panel

Caution

As with Non PIP/SIP Think for the Future and Non PIP/SIP Theraputic Outreach, SIP theraputic outreach and Secondary Inclusion Panel seem to be the exact same cohort of pupils

TO DO - check this in the source data

Secondary Inclusion Panel involvements are associated with a continued decline in attendance:

Secondary Inclusion Panel involvements are associated with a significant improvement in exclusion rates:

2.26 Primary Inclusion Panel

Caution

As with several other pairs of involvements, PIP and PIP theraputic outreach are the exact same cohort

Primary Inclusion Panel involvements are associated with a continued decline in attendance:

Primary Inclusion Panel involvements are associated with a significant improvement in exclusion rates:

2.27 PNOR 50+

Volumes of this involvement are fairly low, and many do not appear to have a Date of Birth recorded, so the age profile could not be calculated.

Volumes of this involvement prior to 2024 are very low, and the involvement generally pertains to older children, and specifically to those not on roll. There is no attendance data after involvement start date for this cohort:

The stacked plot is useful for understanding how recent these involvements are, and how much of the attendance data is either not present (NA - the red bars here) or out of scope (black bars), meaning it relates to periods of time that have not yet occurred or for which the data has not been released:

2.28 I&A - Vulnerable Learner

I&A - Vulnerable Learner involvements are associated with very low attendance levels, and show no apparent improvement once the involvement is in place:

The categorised stacked plot reveals more of what’s happening with this cohort - some reach leaving age, but more simply go missing from the data. There are also large numbers who become severely absent (0-50% - the orange bars here) after the involvement date.
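
The bracket labels used in these stacked plots can be produced along the following lines. This is a sketch only: the labels match the legend of the plots, but the attendance_bracket() helper is hypothetical and the treatment of boundary values (for example, exactly 50% falling into the 50 - 80% bracket) is an assumption, not something confirmed by the data model.

```r
# Hypothetical helper: assign an attendance rate (0 to 1) to a bracket.
# Boundary handling is an assumption: each bracket includes its lower bound,
# and the top bracket also includes 100%.
attendance_bracket <- function(percent_present) {
  bracket <- cut(
    percent_present,
    breaks = c(0, 0.5, 0.8, 0.9, 1),
    labels = c("0 - 50%", "50 - 80%", "80 - 90%", "90 - 100%"),
    include.lowest = TRUE,
    right = FALSE
  )
  bracket <- as.character(bracket)
  # pupils with no attendance record for the period get their own category
  bracket[is.na(percent_present)] <- "NA"
  bracket
}

# Invented example rates for seven pupils in one half term
rates <- c(0.95, 0.42, NA, 0.85, 0.5, 1, 0)
table(attendance_bracket(rates))
```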

2.29 Portage

Portage service is associated with very young children with special educational needs.

The team work largely with children who require SEN support or have EHC plans, across a range of needs, principally speech, language & communication needs and autism.

Portage
counts of children with involvements in 2024, by SEN level
SEN Support	388
EHCP	334
No SEN	74

Portage - counts by IMD quartile
4 = most deprived wards; 1 = least deprived
1	181
2	256
3	276
4	280

Portage
counts of children with involvements in 2024, by primary specific_need
NA	329
Speech, Language And Communication Needs	243
Autistic Spectrum Disorder	195
Severe Learning Difficulty	64
Physical Disability	62
Profound And Multiple Learning Difficulty	48
Social, Emotional and Mental Health	39
Moderate Learning Difficulty	23
Specific Learning Difficulty	19
Other	17
Hearing Impairment	4
Visual Impairment	4
Multi Sensory Impairment	2
No Specialist Assessment	2

For Portage we’ve taken attendance data from year 0 to year 6 for children with an EHC plan, SEN support, Speech, Language and Communication Needs (SLCN), or Autism, in each case comparing children with and without a Portage involvement. The differences here are small, but children with SEN support who are involved with the service have better attendance in all primary years except year 1. There is also evidence that children with Autism show better attendance. For children with SLCN (the most prevalent primary need among children involved with the service) or an EHC plan, the picture is mixed, with no clear evidence of significantly better attendance. (Year 0 is removed here due to low availability of data.)
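
The with and without comparison described above can be sketched as follows. The data here is invented, and the column names nc_year and has_portage are assumptions made for the illustration; present and possible_sessions follow the names used in the report's attendance summary code, where attendance is total sessions present divided by total possible sessions.

```r
library(dplyr)

# Invented toy data: one row per pupil per school year
attendance_example <- tibble(
  stud_id           = c(1, 1, 2, 2, 3, 3, 4, 4),
  nc_year           = c(1, 2, 1, 2, 1, 2, 1, 2),
  has_portage       = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE),
  present           = c(180, 170, 160, 175, 150, 165, 170, 160),
  possible_sessions = rep(190, 8)
)

# Attendance rate by school year, split by involvement status
comparison <- attendance_example |>
  group_by(nc_year, has_portage) |>
  summarise(
    child_count     = n_distinct(stud_id),
    percent_present = sum(present) / sum(possible_sessions),
    .groups = "drop"
  )

comparison
```

A comparison of this kind does not control for level of need, so a gap between the two groups should not be read as the effect of the service on its own.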

2.30 Rowan outreach

The majority of children with the Rowan outreach involvement have primary needs of Autism or Speech, Language & Communication needs:

Rowan outreach
counts of children with involvements in 2024, by primary specific_need
Autistic Spectrum Disorder	113
Speech, Language And Communication Needs	61
Social, Emotional and Mental Health	24
NA	21
Profound And Multiple Learning Difficulty	2
Hearing Impairment	1
Moderate Learning Difficulty	1
Multi Sensory Impairment	1
No Specialist Assessment	1

The ‘before & after’ plot for Rowan outreach shows a change in the direction of average attendance, but the picture is slightly messy, since there isn’t a lot of full “before & after” data available:

Given that Rowan outreach work with children at or before school readiness age, it makes sense to look at the same primary school attendance profile that we did for Portage above.

Here year 0 is removed due to very low data availability. Although children involved with the outreach team generally show lower attendance than those without, children with Speech, Language & Communication Needs, children with Autism, and children with an EHC plan all show significant and consistent improvement in attendance through years 1 to 3 when involved with the team.

2.31 Visual Impairment team

Involvement with the Visual Impairment team is associated with a small improvement in attendance:

The stacked categorised plot shows an improvement in the highest attendance category over the first year of involvement with the team:

As with the Hearing Impairment team, if we compare attendance levels for children with a Visual Impairment with and without the team involvement, we see consistently lower attendance through all school years for children involved with the team. This presumably reflects a higher level of need, or other factors. Children involved with the team do show improvements in attendance levels through primary school:

2.32 UCAN

(this description copied from Sheffield Directory) Sheffield Early Years Language Centre (Ucan) is funded jointly by the NHS and Sheffield Local Authority. The centre is staffed by speech and language therapists from the NHS and a teacher and assistant from the 0-5 SEND Support Service.

The centre provides intensive early intervention for pre-school children with identified developmental language disorder (DLD) and training for parents and Early Years practitioners in meeting children’s needs.

Since involvement with UCAN is exclusively prior to school starting age, we cannot do the ‘before & after’ analysis.

The majority of children involved with UCAN have special educational needs.

UCAN - counts by IMD quartile
4 = most deprived wards; 1 = least deprived
1	15
2	22
3	32
4	25

The majority have speech, language and communication needs:

UCAN
counts of children with involvements in 2024, by primary specific_need
NA	53
Speech, Language And Communication Needs	37
Autistic Spectrum Disorder	1
Physical Disability	1
Severe Learning Difficulty	1
Social, Emotional and Mental Health	1

Some of this may be due to the recency of the UCAN involvements and the general recovery in attendance for younger children, but the data shows that children with UCAN involvement show better attendance in primary than their peers who have similar needs:

2.33 I&A - School Readiness

Warning

Looking at the age profile, there is clearly a data quality issue with some of these involvements.

The team work more with children in more deprived areas, and those who require SEN support.

I&A - School Readiness
counts of children with involvements in 2024, by SEN level
SEN Support	270
EHCP	146
No SEN	94

I&A - School Readiness - counts by IMD quartile
4 = most deprived wards; 1 = least deprived
4	230
3	166
2	124
1	51

I&A - School Readiness
counts of children with involvements in 2024, by primary specific_need
NA	171
Speech, Language And Communication Needs	163
Autistic Spectrum Disorder	136
Social, Emotional and Mental Health	72
Moderate Learning Difficulty	9
Other	8
Specific Learning Difficulty	6
Physical Disability	5
Severe Learning Difficulty	5
Hearing Impairment	4
Visual Impairment	3
Profound And Multiple Learning Difficulty	2
No Specialist Assessment	1

If we look at all pupils, we see that children the school readiness team work with tend to have lower attendance throughout primary school, including a bigger drop-off into Y6. Children involved with the team show an overall sustained improvement in their attendance through primary school:

Breaking this down for the same characteristic groups we used for various other involvements above, we see generally lower attendance for the children accessing the service, which probably reflects factors not captured in the data - i.e. significant need beyond what we’re controlling for here. The exception seems to be children with Autism, whose attendance appears to benefit significantly.

2.34 S437- School Att Order EHE

These are school attendance orders relating to electively home educated (EHE) children. In 2024 Sheffield City Council has just 35 of these recorded. The orders can be issued to the parents of children of any age.

S437 Attendance Orders are followed by a sudden and significant improvement in attendance levels.

Behind those averages, though, there are still large volumes of children with entirely missing attendance data - are these the children still being home educated?

2.35 S437 - School Att Order CME

School attendance orders relating to a Child Missing Education (CME). As with the EHE attendance orders, Sheffield issues just a few dozen of these per year. CME orders are mostly issued to parents of secondary school pupils:

In the ‘before & after’ analysis, we see that prior half terms are blank due to children missing school entirely for this period - hence the attendance order. Average attendance after the attendance orders improves dramatically (but remains below overall average levels).

(Note that some half term -1 data here was removed, as it is presumed to have been indexed to the wrong date.)

All coded absence reasons show a large decrease following the CME attendance orders:

The categorised attendance data shows a dramatic change following this involvement, with a big reduction in the NA category and increases across all ‘present’ categories - though there are also signs that effects have a limited timespan, as the % in the highest attendance category drops away, and the NA category increases once again.

2.36 I&A - Reintegration

The plot above shows a significant shift immediately after the involvement. The categorised reasons show a similar shift whether we look at it half-term to half-term or year on year - a reduction in overall absences, mostly driven by a reduction in exclusions, while illness & lateness levels rise slightly:

Finally, the stacked categorised bar chart shows more detail than the average before & after plot - with the most significant movement being between those severely absent and those in the highest attendance bracket (although these fall away over time):

2.37 School Attendance Order Breach

Finally, the stacked categorised bar chart shows how, despite the increase in average attendance, the majority of children are absent from the attendance records entirely both before and after the attendance order breach involvement:

2.38 GP protocol

There is no apparent effect on attendance levels from this involvement:

The categorised reason plot is not included here, as it shows no particular movement. However, although there is no change in overall average attendance, there is significant movement on the stacked bar plot of attendance brackets. This happens immediately following the involvement, with growth in both the highest and lowest attendance brackets:

2.39 Nurture

There are five different involvement types here all under the banner of ‘Nurture’. It looks like these cover different sites in the city, but the assumption is that they are all broadly the same service and so are covered here together.

Those 5 involvement types are:

Nurture - BLT Hinde House
Nurture - BLT Yewlands
Nurture - Bumblebee
Nurture - BLT Earl Marshall
Nurture - Step Out

Volumes are quite low and except for Bumblebee much of the activity is very recent:

There are differences in the age profile of the different nurture type involvements.

Nurture type involvements show continued low attendance after involvement - but with some signs of improvement over time:

Nurture type involvements show an immediate and dramatic reduction in exclusion rates:

Returning to attendance, the stacked bar plot of attendance brackets is more revealing than the average attendance levels. Preschool half term periods and some COVID periods have been removed here. Although the overall average doesn’t change, the involvement start date sees a big reduction in the worst attendance bracket (0 - 50%) and growth in both the highest (90% +) and middle (50 - 80%) attendance brackets.
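
A small invented example shows how this can happen: the mean attendance of a group can stay the same while pupils move between brackets. The figures below are made up purely for illustration, and the to_bracket() helper and its boundary handling are assumptions.

```r
# Hypothetical helper assigning attendance rates (0 to 1) to the plot brackets
to_bracket <- function(x) {
  cut(x,
      breaks = c(0, 0.5, 0.8, 0.9, 1),
      labels = c("0 - 50%", "50 - 80%", "80 - 90%", "90 - 100%"),
      include.lowest = TRUE, right = FALSE)
}

# Invented attendance rates for the same four pupils before and after
before <- c(0.30, 0.45, 0.85, 0.88)
after  <- c(0.41, 0.55, 0.60, 0.92)

# The two means are equal ...
c(before = mean(before), after = mean(after))

# ... but the distribution across brackets is not
bracket_counts <- rbind(
  before = table(to_bracket(before)),
  after  = table(to_bracket(after))
)
bracket_counts
```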

2.40 Managed moves

The volumes available for managed moves are very low:

Because the volumes are so small, the confidence intervals here are wide. Managed Moves are associated with a continuing decline in attendance levels:
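
The link between small volumes and wide intervals can be illustrated with the same t-based 95% interval used in the report's summary functions (mean plus or minus the t quantile times the standard error). The ci_half_width() helper and the standard deviation of 0.2 are assumptions chosen for the illustration.

```r
# Hypothetical helper: half-width of a t-based confidence interval for a mean
ci_half_width <- function(sd, n, level = 0.95) {
  qt(1 - (1 - level) / 2, df = n - 1) * sd / sqrt(n)
}

# Same spread of attendance (sd = 0.2, an invented value), different cohort sizes
cohort_sizes <- c(5, 20, 100, 1000)
widths <- ci_half_width(sd = 0.2, n = cohort_sizes)

data.frame(n = cohort_sizes, half_width = round(widths, 3))
```

The interval narrows as the cohort grows, both because the standard error shrinks with the square root of n and because the t quantile approaches its large-sample value.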

Looking at coded reasons before & after a managed move, the overall increase in absence seems to be driven by an increase in ‘no reason’ absences:

--- title: "Attendance and Inclusion Services and Interventions" author: "Giles Robinson" date: 2024-07-09 editor: visual format: html: code-tools: true #self-contained: true #code-fold: true toc: true toc-location: left toc-depth: 4 toc-fold: false number-sections: true fig-cap-location: top #embed-resources: true fig-height: 4 other-links: - text: Back to SCC Data Science site home href: https://scc-data-science.sheffield.gov.uk/ execute: warning: false message: false echo: false knitr: opts_chunk: out.width: "80%" fig.align: center --- # Setup {.unnumbered} ```{r} #| label: setup # knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE) # clear the environment remove(list = ls()) # load packages library(tidyverse) library(janitor) library(lubridate) library(ggtext) library(ggrepel) library(gghighlight) library(kableExtra) library(MetBrewer) library(corrplot) library(ggcorrplot) #library(shadowtext) library(readxl) library(ggstatsplot) library(geosphere) library(ggridges) library(forecast) library(tsibble) library(gt) # specify data folder data_folder <- str_c("S:/Public Health/Policy Performance Communications/Business Intelligence/Projects/EIP/data/inclusion/") # copy to excel function copy_excel <- function(input) {write.table(input, file = "clipboard-50000", sep = "\t", row.names = F)} # ggplot themes eb <- element_blank() # Set default ggplot theme theme_set( theme_classic() + theme( #plot.title = element_text(), plot.subtitle = element_text(size = 10, face = "italic"), plot.caption = element_text(size = 9, face = "italic"), plot.title.position = "plot", plot.title = element_markdown(size = 12) ) ) # theme for minimal bar charts barplottheme_minimal <- theme( axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb, axis.line.x = eb, axis.ticks.x = eb ) gannt_theme <- theme_classic() + theme( plot.title = element_text(size = 12), plot.subtitle = element_text(size = 8, face = "italic"), plot.caption = element_text(size = 8, face = "italic"), 
plot.title.position = "plot", axis.title = eb, axis.line.y = eb, axis.ticks.y = eb, axis.text.y = eb, legend.position = "right", legend.title = eb, legend.text = element_text(size = 8) ) # stack colours for the categorised attendance charts stack_colours <- c( "NA" = "#d73027", "0 - 50%" = "#fc8d59", "50 - 80%" = "#fee08b", "80 - 90%" = "#d9ef8b", "90 - 100%" = "#91cf60", "preschool" = "#aed6f1", "leaving age" = "grey60", "covid" = "#af7ac5", "out of scope" = "grey20" ) # summarising attendance function # this is copied from the attendance & exclusion data model. # any changes made there should be reflected here & vice versa # note that the groupings appear TWICE in this function, once for grouped data and once for the "no grouping" scenario (grouping_vars = "none"). Any changes must be consistent across both. summarise_attendance <- function(input_data, grouping_vars) { ifelse (grouping_vars == "none", { # Aggregate without grouping result <- input_data |> mutate(zero_attendance = if_else(present == 0, 1, 0)) |> summarise(child_count = n_distinct(stud_id, na.rm = TRUE), row_count = n(), possible_sessions = sum(possible_sessions, na.rm = TRUE), present = sum(present, na.rm = TRUE), authorised = sum(authorised, na.rm = TRUE), unauthorised = sum(unauthorised, na.rm = TRUE), missing = sum(missing, na.rm = TRUE), excluded = sum(excluded, na.rm = TRUE), family_holiday_agreed = sum(family_holiday_agreed, na.rm = TRUE), family_holiday_not_agreed = sum(family_holiday_not_agreed, na.rm = TRUE), family_holiday_total = sum(family_holiday_total, na.rm = TRUE), illness = sum(illness, na.rm = TRUE), med_appt = sum(med_appt, na.rm = TRUE), no_reason = sum(no_reason, na.rm = TRUE), late_absent = sum(late_absent, na.rm = TRUE), late_pres = sum(late_pres, na.rm = TRUE), late_total = sum(late_absent, na.rm = TRUE) + sum(late_pres, na.rm = TRUE), study_leave = sum(study_leave, na.rm = TRUE), approved_offsite = sum(approved_offsite, na.rm = TRUE), fixed_exclusions = 
sum(fixed_exclusions, na.rm = TRUE), perm_exclusions = sum(perm_exclusions, na.rm = TRUE), total_exclusions = sum(total_exclusions, na.rm = TRUE), persistent_absent_count = sum(persistent_absence, na.rm = TRUE), severe_absent_count = sum(severe_absence, na.rm = TRUE), zero_attendance_count = sum(zero_attendance, na.rm = TRUE) ) |> mutate(percent_of_pupils = child_count / sum(child_count, na.rm = TRUE), percent_present = present / possible_sessions, percent_auth_absence = authorised / possible_sessions, percent_unauth_absence = unauthorised / possible_sessions, percent_missing = missing / possible_sessions, percent_family_holiday_agreed = family_holiday_agreed / possible_sessions, percent_family_holiday_not_agreed = family_holiday_not_agreed / possible_sessions, percent_family_holiday = family_holiday_total / possible_sessions, percent_excluded = excluded / possible_sessions, percent_illness = illness / possible_sessions, percent_med_appt = med_appt / possible_sessions, percent_no_reason = no_reason / possible_sessions, percent_late_absent = late_absent / possible_sessions, percent_late_pres = late_pres / possible_sessions, percent_late_total = late_total / possible_sessions, percent_study_leave = study_leave / possible_sessions, percent_approved_offsite = approved_offsite / possible_sessions, pc_of_pupils_persistent_absent = persistent_absent_count / row_count, pc_of_pupils_severely_absent = severe_absent_count / row_count, pc_of_pupils_zero_attendance = zero_attendance_count / row_count ) |> mutate(percent_absent = 1 - percent_present) }, { # Group by specified variables and then summarize result <- input_data |> mutate(zero_attendance = if_else(present == 0, 1, 0)) |> group_by(across(all_of(grouping_vars))) |> summarise(child_count = n_distinct(stud_id, na.rm = TRUE), row_count = n(), possible_sessions = sum(possible_sessions, na.rm = TRUE), present = sum(present, na.rm = TRUE), authorised = sum(authorised, na.rm = TRUE), unauthorised = sum(unauthorised, na.rm = 
TRUE), missing = sum(missing, na.rm = TRUE), excluded = sum(excluded, na.rm = TRUE), family_holiday_agreed = sum(family_holiday_agreed, na.rm = TRUE), family_holiday_not_agreed = sum(family_holiday_not_agreed, na.rm = TRUE), family_holiday_total = sum(family_holiday_total, na.rm = TRUE), illness = sum(illness, na.rm = TRUE), med_appt = sum(med_appt, na.rm = TRUE), no_reason = sum(no_reason, na.rm = TRUE), late_absent = sum(late_absent, na.rm = TRUE), late_pres = sum(late_pres, na.rm = TRUE), late_total = sum(late_absent, na.rm = TRUE) + sum(late_pres, na.rm = TRUE), study_leave = sum(study_leave, na.rm = TRUE), approved_offsite = sum(approved_offsite, na.rm = TRUE), fixed_exclusions = sum(fixed_exclusions, na.rm = TRUE), perm_exclusions = sum(perm_exclusions, na.rm = TRUE), total_exclusions = sum(total_exclusions, na.rm = TRUE), persistent_absent_count = sum(persistent_absence, na.rm = TRUE), severe_absent_count = sum(severe_absence, na.rm = TRUE), zero_attendance_count = sum(zero_attendance, na.rm = TRUE) ) |> mutate(percent_of_pupils = child_count / sum(child_count, na.rm = TRUE), percent_present = present / possible_sessions, percent_auth_absence = authorised / possible_sessions, percent_unauth_absence = unauthorised / possible_sessions, percent_missing = missing / possible_sessions, percent_family_holiday_agreed = family_holiday_agreed / possible_sessions, percent_family_holiday_not_agreed = family_holiday_not_agreed / possible_sessions, percent_family_holiday = family_holiday_total / possible_sessions, percent_excluded = excluded / possible_sessions, percent_illness = illness / possible_sessions, percent_med_appt = med_appt / possible_sessions, percent_no_reason = no_reason / possible_sessions, percent_late_absent = late_absent / possible_sessions, percent_late_pres = late_pres / possible_sessions, percent_late_total = late_total / possible_sessions, percent_study_leave = study_leave / possible_sessions, percent_approved_offsite = approved_offsite / 
possible_sessions, pc_of_pupils_persistent_absent = persistent_absent_count / row_count, pc_of_pupils_severely_absent = severe_absent_count / row_count, pc_of_pupils_zero_attendance = zero_attendance_count / row_count )|> mutate(percent_absent = 1 - percent_present) } ) return(result) } summarise_avg <- function(input_data) { summarise (input_data, mean.percent_present = mean(percent_present, na.rm = TRUE), sd.percent_present = sd(percent_present, na.rm = TRUE), n.percent_present = n() ) |> mutate(se.percent_present = sd.percent_present / sqrt(n.percent_present), lower.ci.percent_present = mean.percent_present - qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present, upper.ci.percent_present = mean.percent_present + qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present ) } percent_calc <- function(input_data) {input_data |> tally() |> mutate(freq = n / sum(n)) |> mutate( l_ci = freq - (1.96 * sqrt((freq * (1 - freq)) / n)), u_ci = freq + (1.96 * sqrt((freq * (1 - freq)) / n)) )} presence_mean_calc <- function(input_data) {input_data |> summarise(mean.percent_present = mean(percent_present, na.rm = TRUE), sd.percent_present = sd(percent_present, na.rm = TRUE), n.percent_present = n() ) |> mutate(se.percent_present = sd.percent_present / sqrt(n.percent_present), lower.ci.percent_present = mean.percent_present - qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present, upper.ci.percent_present = mean.percent_present + qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present) } # define summarise before after function - this calculates the mean difference +- 95 CI summarise_before_after <- function(input_data) { summarise (input_data, mean.diff = mean(diff, na.rm = TRUE), sd.diff = sd(diff, na.rm = TRUE), n.diff = n() ) |> mutate(se.diff = sd.diff / sqrt(n.diff), lower.ci.diff = mean.diff - qt(1 - (0.05 / 2), n.diff - 1) * se.diff, upper.ci.diff = mean.diff + qt(1 - (0.05 / 2), n.diff - 1) * se.diff ) } # function to generate annual 
codified before & after involvement summary data before_after_year_fun <- function( input_data = inv_summary, cohort_name_selection) { mean <- inv_summary |> filter(cohort_name == cohort_name_selection) |> ungroup() |> select(contains("mean")) |> select(contains("year")) |> pivot_longer(everything(), values_to = "mean", names_to = "name") |> filter(!str_detect(name, "close")) |> filter(!str_detect(name,"present")) |> mutate(prior_post = if_else(str_detect(name, "prior") == TRUE, "prior","post")) |> mutate(prior_post = factor(prior_post, levels = c("prior","post"))) |> mutate(name = str_replace(name, "mean_percent_","")) |> mutate(name = str_replace(name, "_prior","")) |> mutate(name = str_replace(name, "_post","")) |> mutate(name = str_replace(name, "_start","")) |> mutate(name = str_replace(name, "_year","")) |> mutate(name = case_when(name == "late_pres" ~ "late present", TRUE ~ str_replace(name,"_"," "))) |> mutate(name = if_else(name == "absent", "total absences", name)) upper_ci <- inv_summary |> filter(cohort_name == cohort_name_selection) |> ungroup() |> select(contains("upper_ci")) |> select(contains("year")) |> pivot_longer(everything(), values_to = "upper_ci", names_to = "name") |> filter(!str_detect(name, "close")) |> #filter(!str_detect(name,"present")) |> mutate(prior_post = if_else(str_detect(name, "prior") == TRUE, "prior","post")) |> mutate(prior_post = factor(prior_post, levels = c("prior","post"))) |> mutate(name = str_replace(name, "upper_ci_percent_","")) |> mutate(name = str_replace(name, "_prior","")) |> mutate(name = str_replace(name, "_post","")) |> mutate(name = str_replace(name, "_start","")) |> mutate(name = str_replace(name, "_year","")) |> mutate(name = case_when(name == "late_pres" ~ "late present", TRUE ~ str_replace(name,"_"," "))) |> mutate(name = if_else(name == "absent", "total absences", name)) lower_ci <- inv_summary |> filter(cohort_name == cohort_name_selection) |> ungroup() |> select(contains("lower_ci")) |> 
select(contains("year")) |> pivot_longer(everything(), values_to = "lower_ci", names_to = "name") |> filter(!str_detect(name, "close")) |> #filter(!str_detect(name,"present")) |> mutate(prior_post = if_else(str_detect(name, "prior") == TRUE, "prior","post")) |> mutate(prior_post = factor(prior_post, levels = c("prior","post"))) |> mutate(name = str_replace(name, "lower_ci_percent_","")) |> mutate(name = str_replace(name, "_prior","")) |> mutate(name = str_replace(name, "_post","")) |> mutate(name = str_replace(name, "_start","")) |> mutate(name = str_replace(name, "_year","")) |> mutate(name = case_when(name == "late_pres" ~ "late present", TRUE ~ str_replace(name,"_"," "))) |> mutate(name = if_else(name == "absent", "total absences", name)) # join joined_data <- left_join(mean,upper_ci, by = c("name","prior_post")) |> left_join(lower_ci, by = c("name","prior_post")) return(joined_data) } # function to generate half termly codified before & after involvement summary data before_after_ht_fun <- function( input_data = inv_summary, cohort_name_selection) { mean <- inv_summary |> filter(cohort_name == cohort_name_selection) |> ungroup() |> select(contains("mean")) |> select(-contains("10"),-contains("11"),-contains("12")) |> select(contains("prior_1") | contains("post_1")) |> pivot_longer(everything(), values_to = "mean", names_to = "name") |> filter(!str_detect(name, "close")) |> filter(!str_detect(name,"present")) |> mutate(prior_post = if_else(str_detect(name, "prior") == TRUE, "prior","post")) |> mutate(prior_post = factor(prior_post, levels = c("prior","post"))) |> mutate(name = str_replace(name, "mean_percent_","")) |> mutate(name = str_replace(name, "_prior","")) |> mutate(name = str_replace(name, "_post","")) |> mutate(name = str_replace(name, "_start","")) |> mutate(name = str_replace(name, "_year","")) |> mutate(name = str_replace(name,"_1","")) |> mutate(name = case_when(name == "late_pres" ~ "late present", TRUE ~ str_replace(name,"_"," "))) |> mutate(name 
= if_else(name == "absent", "total absences", name)) upper_ci <- inv_summary |> filter(cohort_name == cohort_name_selection) |> ungroup() |> select(contains("upper_ci")) |> select(-contains("10"),-contains("11"),-contains("12")) |> select(contains("prior_1") | contains("post_1")) |> pivot_longer(everything(), values_to = "upper_ci", names_to = "name") |> filter(!str_detect(name, "close")) |> #filter(!str_detect(name,"present")) |> mutate(prior_post = if_else(str_detect(name, "prior") == TRUE, "prior","post")) |> mutate(prior_post = factor(prior_post, levels = c("prior","post"))) |> mutate(name = str_replace(name, "upper_ci_percent_","")) |> mutate(name = str_replace(name, "_prior","")) |> mutate(name = str_replace(name, "_post","")) |> mutate(name = str_replace(name, "_start","")) |> mutate(name = str_replace(name, "_year","")) |> mutate(name = str_replace(name,"_1","")) |> mutate(name = case_when(name == "late_pres" ~ "late present", TRUE ~ str_replace(name,"_"," "))) |> mutate(name = if_else(name == "absent", "total absences", name)) lower_ci <- inv_summary |> filter(cohort_name == cohort_name_selection) |> ungroup() |> select(contains("lower_ci")) |> select(-contains("10"),-contains("11"),-contains("12")) |> select(contains("prior_1") | contains("post_1")) |> pivot_longer(everything(), values_to = "lower_ci", names_to = "name") |> filter(!str_detect(name, "close")) |> mutate(prior_post = if_else(str_detect(name, "prior") == TRUE, "prior","post")) |> mutate(prior_post = factor(prior_post, levels = c("prior","post"))) |> mutate(name = str_replace(name, "lower_ci_percent_","")) |> mutate(name = str_replace(name, "_prior","")) |> mutate(name = str_replace(name, "_post","")) |> mutate(name = str_replace(name, "_start","")) |> mutate(name = str_replace(name, "_year","")) |> mutate(name = str_replace(name,"_1","")) |> mutate(name = case_when(name == "late_pres" ~ "late present", TRUE ~ str_replace(name,"_"," "))) |> mutate(name = if_else(name == "absent", "total 
absences", name)) # join joined_data <- left_join(mean,upper_ci, by = c("name","prior_post")) |> left_join(lower_ci, by = c("name","prior_post")) return(joined_data) } ``` ```{r} #| label: load data #| echo: false # load the entire data model load(str_c(data_folder,"attendance_inclusion_data_model.RData")) ``` # Introduction A programme of analytical work was began by the SCC BI team in September 2023, with the aims of understanding the key drivers of school absences in Sheffield and the reach and effectiveness of existing services and interventions. The first of those two requirements is written up the report *Attendance in Sheffield Schools*. This report addresses the second requirement. ::: callout-note ## A note on terminology The terms *services* and *interventions* are generally used interchangeably. Some of the things we're evaluating here might be better classed as events. Even so, in each case we're treating them the same, using the data to evaluate the impact of on attendance. ::: ## Data sources & processing This is only a brief overview of data sources & processing used in this report. Please contact the BI team giles.robinson@sheffield.gov.uk if you require further detail. * Attendance data is held in the PAS Oscar database tables * Involvements are held in PAS Oscar database * Child social care episodes are retrieved from Oscar database but are recording on LiquidLogic children's system. * Other services and interventions which are not stored as involvements (school attendance orders, parenting programmes, managed moves) are also retrieved from the Oscar schemas. * Economic deprivation information is retrieved from local indicators data * Demographics are from the Oscar database * Special Educational Needs data is from the school census * School information is from the Oscar database A separate data model R script retrieves the data listed above and transforms it into clean processed data for analysis in this report script. 
Crucially for this analysis, attendance data is categorised by DfE attendance codes and calculated as % of available sessions attended, and these metrics are mapped onto the involvements data according to involvement start dates. Average 'before and after' attendance metrics are then calculated. This creates an involvement summary which can be generated for any demographic group for which we have data.

## Analytical approach

### Measures of effectiveness & finding suitable comparators {.unnumbered}

In this report our aim is to evaluate how effective different things are at increasing school attendance and reducing exclusion rates - though the primary focus is attendance. There are a few difficulties to this:

- Services are allocated according to need, and not on a random basis, so we often do not have a "control group"
- Attendance tends to reduce over time as children progress through secondary school - so our background rate is (in older children at least) always reducing, and we shouldn't necessarily be surprised to see that our interventions are associated with a reduction in attendance, even if the stated aim is to increase attendance

To tackle this we'll take a few different approaches, explained here with example plots:

### Average attendance before & after {.unnumbered}

The ideal approach is to compare a baseline period (prior to intervention) to a test period (following intervention) for the same children. We can then see differences in overall attendance and coded absence reasons, and we can see how these differ for different groups.

This shows the overall trajectory of changing attendance, and how this is affected by our involvements - though it's important to remember that the points on these charts are just the overall average, behind which lies a lot of variation.

For example, this is the plot for Attendance Advice. We see attendance declining prior to involvement and improving afterward.

```{r}
#| label: plot avg before and after example attendance advice
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "Attendance Advice") |>
  ggplot(aes(x = term_index,
             y = mean.percent_present)) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue") +
  labs(title = "Average attendance before and after intervention - Attendance Advice",
       subtitle = "Mean percentage of available sessions attended +- 95 CI",
       caption = "data from Capita One",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

::: callout-important
The pattern seen here is repeated for many of the interventions covered in this report. The overall trend here changes direction, but attendance levels do not return to what they were prior to involvement. Moreover, if we add up all the attendance on the left hand side of the chart above and compare it to that on the right, we see a net *negative* change. Attendance Advice is associated with a *decrease* in attendance levels.
:::

### Categorised attendance before & after {.unnumbered}

This view looks at the same indexed time periods as the average attendance before & after plot shown above, but here children are categorised according to attendance brackets.
For each time period (again, relative to the involvement start or closure date) the % of children in each attendance category is calculated:

- 0 - 50%
- 50 - 80%
- 80 - 90%
- 90 - 100%

Also included here are several important other categories:

- "pre-school" - children who have not yet reached school age at the time period in question
- "leaving age" - those who have completed Y11 and left school at the time period in question
- "COVID" - half term periods that fell during COVID lockdowns, the attendance data for which is incomplete or unreliable
- "out of scope" - time periods that have not elapsed yet or for which data is not yet available. These are categorised but excluded from the % calculations used in the plots.

This can reveal patterns of change in the distribution of attendance that lies behind the averages.

Here's an example categorised stacked bar plot, in this case for Attendance Advice:

```{r}
#| label: categorised stack plot example

stack_data <- involvements |>
  filter(cohort_name == "Attendance Advice") |>
  select(stud_id,starts_with("attend_cat")) |>
  pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index,"attend_cat_","")) |>
  mutate(index = str_replace(index,"start_","")) |>
  mutate(index = str_replace(index,"_"," ")) |>
  mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7","prior 6","prior 5","prior 4","prior 3","prior 2","prior 1",
                                          "post 1","post 2","post 3","post 4","post 5","post 6","post 7","post 8","post 9","post 10","post 11","post 12"))) |>
  group_by(index, attend_cat) |>
  filter(attend_cat != "out of scope") |>
  #filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 100%","preschool","leaving age","covid","out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb, legend.position = "right", axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Categorised attendance before & after - Attendance Advice",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       caption = "data from Capita One",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label = str_wrap("inv start", width = 10), y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

### Coded absence reasons {.unnumbered}

Comparing levels of different coded absence reasons before & after involvement may reveal changes in patterns of behaviour. We can do this at the level of the year prior and year post involvement, or at the level of the half term prior and post involvement. When looking at the year prior & post we restrict the data to only children who have at least 3 half terms' worth of data available in each of those one year periods.
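
The restriction described above can be sketched as follows. This is a minimal illustration on made-up data, not the data model's actual code: the table `toy_attend` and its columns `stud_id`, `period` and `percent_present` are assumptions introduced for the example, and it presumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical half-termly attendance: one row per child per half term,
# with period = "prior"/"post" relative to the involvement start date.
toy_attend <- tibble(
  stud_id = c(rep(1, 8), rep(2, 5)),
  period  = c(rep("prior", 4), rep("post", 4), rep("prior", 2), rep("post", 3)),
  percent_present = c(0.95, 0.90, NA, 0.85, 0.80, 0.82, 0.88, NA,
                      0.70, 0.65, 0.60, 0.72, 0.75)
)

# Keep only children with at least 3 half terms of data in BOTH one-year windows
eligible <- toy_attend |>
  filter(!is.na(percent_present)) |>
  count(stud_id, period, name = "half_terms") |>
  group_by(stud_id) |>
  summarise(enough_data = n_distinct(period) == 2 && all(half_terms >= 3)) |>
  filter(enough_data)

# Year prior vs year post averages for the eligible children only
year_summary <- toy_attend |>
  semi_join(eligible, by = "stud_id") |>
  group_by(period) |>
  summarise(mean_present = mean(percent_present, na.rm = TRUE))
```

In this toy data child 2 has only two half terms of prior data, so only child 1 contributes to `year_summary`.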

As an example, here is the coded absence before & after plot for the 'Current EHCP' involvement:

```{r}
#| label: plot before & after by reason code example Current EHCP
#| fig-height: 4

before_after_year_fun(cohort_name_selection = "Current EHCP") |>
  ggplot(aes(x = name, y = mean, fill = prior_post, label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci,ymin = lower_ci), width = 0.2, alpha = 0.5, position = position_dodge(width = 0.9), colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons before and after Current EHCP involvement",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12), legend.position = "none", axis.title = eb, axis.text.y = eb) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 2))
```

### With & without analysis {.unnumbered}

This approach compares the attendance levels (and coded absence reasons) of groups who had the intervention with those who did not. This is more problematic because finding suitable comparator groups is difficult, but it is useful for interventions where no prior period data is available - such as the readiness for school team, or 0-5 SEND team.

Here's an example 'with & without' plot, in this case plotting attendance from year 1 through to year 6 for those with and without the Inclusion & Attendance Y4 team involvement:

```{r}
#| label: plot primary attendance with and without example
#| fig-height: 4

y4_inv_stud_id <- involvements |>
  filter(cohort_name == "I&A - Y4") |>
  select(stud_id) |>
  distinct() |>
  mutate(y4_team = TRUE)

attend_pri <- attend |>
  filter(ncy %in% c(1,2,3,4,5,6) & year >= 2021)

y4_primary <- attend_pri |>
  left_join(y4_inv_stud_id, by = "stud_id") |>
  mutate(y4_team = if_else(is.na(y4_team),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","y4_team")) |>
  group_by(ncy,y4_team) |>
  summarise_avg() |>
  mutate(category = "overall")

ggplot(y4_primary,
       aes(x = ncy,
           y = mean.percent_present,
           colour = y4_team,
           group = y4_team)) +
  geom_point() +
  geom_line(alpha = 0.7) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2) +
  geom_vline(aes(xintercept = 6.5), linetype = "dotted", colour = "gray") +
  labs(title = "Attendance through primary - pupils without and with I&A Y4 involvement",
       subtitle = "Average percentage of available sessions attended since 2021; +- 95 CI",
       caption = "data from Capita One") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(labels = seq(1,6), breaks = seq(1,6)) +
  theme(axis.title.x = eb, legend.position = "none", strip.background = eb, strip.text = element_text(size = 8), plot.title = element_markdown(size = 12)) +
  MetBrewer::scale_fill_met_d("Egypt")
```

## Random sampling

To contextualise the above, we can compare the outputs of the analysis described above with a similar analysis for randomly selected children with no involvement. Before we get into the analysis of the actual involvements, it's worth looking at a couple of these.

Firstly, a random sample average trajectory plot.
Here we've taken 200 random children whose attendance was consistently below 75% (for 3 or more half terms in a row) since 2021, and who did not have any involvements.

```{r}
#| label: plot example random sample trajectory plot

cohort_name_selection <- "consistent_below 0.75_2021_onwards sample 1"

ggplot(random_samples_long |> filter(cohort_name == cohort_name_selection),
       aes(x = term_index,
           y = mean.percent_present,
           colour = cohort_name)) +
  geom_point(colour = "steel blue") +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue") +
  labs(title = "Attendance before and after analysis - attendance consistently <75% sample 1",
       subtitle = "Avg % of sessions attended +- 95 CI, random sample of 200 pupils with 3 consecutive half terms below 75%",
       caption = "data from Capita One",
       x = "half term (relative to consistent absence period)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none") #+
  #geom_hline(yintercept = attend_2023_avg, linetype = "dashed", colour = "dark grey") +
  #geom_text(aes(x = 4, label = "2023 avg", y = attend_2023_avg), size = 2.5, vjust = 1, colour = "dark grey")

remove(cohort_name_selection)
```

These random samples often show better apparent recovery than many of the involvements. The best explanation for this is *regression to the mean*, the idea that when conditions are in a more unfavourable state, they will often revert to a less unfavourable state, even in the absence of any intervention. This can lead, for example, to people concluding that the medicine that they began taking when they were feeling particularly ill was the cause of their recovery, when in fact they would have recovered naturally without the medicine.

Next we'll look at the *categorised attendance* of a random sample, before & after a random date.
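
Before turning to that plot, the regression to the mean effect described above can be illustrated with a small simulation. This is a sketch on synthetic data, not the report's sampling code: each simulated child has a stable underlying attendance level plus independent half-termly noise (the distributions and the helper `observe()` are assumptions made for illustration), and children are selected on a poor first half term with no intervention applied.

```r
set.seed(42)

n_children <- 5000

# Assumed stable underlying attendance level per child, clamped to [0, 1]
underlying <- pmin(pmax(rnorm(n_children, mean = 0.90, sd = 0.06), 0), 1)

# Hypothetical helper: observed attendance = underlying level + independent noise
observe <- function(level) {
  pmin(pmax(level + rnorm(length(level), sd = 0.08), 0), 1)
}

half_term_1 <- observe(underlying)
half_term_2 <- observe(underlying)  # no intervention between the two half terms

# Select children on a poor first half term, as a needs-based service would
selected <- half_term_1 < 0.75

mean_selected_before <- mean(half_term_1[selected])
mean_selected_after  <- mean(half_term_2[selected])

# Average attendance of the selected group, before and after selection
round(c(before = mean_selected_before, after = mean_selected_after), 3)
```

In this setup the selected group's average rises in the second half term purely because of how the group was chosen, which is why the random samples are a useful reference point for the plots that follow.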
When evaluating involvements we will typically remove the pre-school, COVID, leaving age and out of scope categories, but they're included here. The profile of these plots is usually very different for children with involvements.

```{r}
#| label: plot random sample categorised attendance

#random_samples |> select(cohort_name) |> distinct()

stack_data <- random_samples |>
  ungroup() |>
  filter(cohort_name == "all_time sample 2") |>
  select(stud_id,starts_with("attend_cat")) |>
  pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index,"attend_cat_","")) |>
  mutate(index = str_replace(index,"start_","")) |>
  mutate(index = str_replace(index,"_"," ")) |>
  mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7","prior 6","prior 5","prior 4","prior 3","prior 2","prior 1",
                                          "post 1","post 2","post 3","post 4","post 5","post 6","post 7","post 8","post 9","post 10","post 11","post 12"))) |>
  group_by(index, attend_cat) |>
  #filter(attend_cat != "out of scope") |>
  #filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 100%","preschool","leaving age","covid","out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb, legend.position = "right", axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Categorised attendance of a random sample of 200 pupils, before & after random date",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       caption = "data from Capita One",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label = str_wrap("inv start", width = 10), y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

Finally we'll look at an example of the *codified reasons* for a random sample. These four plots each show the year on year change for a random sample of 200 children, based on a randomly chosen date since 2021.

```{r}
#| label: plot random sample codified reasons
#| fig-height: 8

cohort_name_selection <- c("post_pandemic sample 1",
                           "post_pandemic sample 5",
                           "post_pandemic sample 8",
                           "post_pandemic sample 10")

mean <- random_samples_long_year |>
  filter(cohort_name %in% cohort_name_selection) |>
  ungroup() |>
  select(cohort_name, year_cat, contains("mean") & contains("percent")) |>
  mutate(year_cat = factor(year_cat, levels = c("prior","post"))) |>
  pivot_longer(-c(year_cat,cohort_name), values_to = "mean", names_to = "name") |>
  mutate(name = str_replace(name, "mean.","")) |>
  mutate(name = str_replace(name, "percent","")) |>
  mutate(name = case_when(name == "_late_pres" ~ "late present",
                          TRUE ~ str_trim(str_replace_all(name,"_"," "))))

upper_ci <- random_samples_long_year |>
  filter(cohort_name %in% cohort_name_selection) |>
  ungroup() |>
  select(cohort_name, year_cat, contains("upper.ci") & contains("percent")) |>
  mutate(year_cat = factor(year_cat, levels = c("prior","post"))) |>
  pivot_longer(-c(year_cat,cohort_name), values_to = "upper_ci", names_to = "name") |>
  mutate(name = str_replace(name, "upper.ci.","")) |>
  mutate(name = str_replace(name, "percent","")) |>
  mutate(name = case_when(name == "_late_pres" ~ "late present",
                          TRUE ~ str_trim(str_replace_all(name,"_"," "))))

lower_ci <- random_samples_long_year |>
  filter(cohort_name %in% cohort_name_selection) |>
  ungroup() |>
  select(cohort_name, year_cat, contains("lower.ci") & contains("percent")) |>
  mutate(year_cat = factor(year_cat, levels = c("prior","post"))) |>
  pivot_longer(-c(year_cat,cohort_name), values_to = "lower_ci", names_to = "name") |>
  mutate(name = str_replace(name, "lower.ci.","")) |>
  mutate(name = str_replace(name, "percent","")) |>
  mutate(name = case_when(name == "_late_pres" ~ "late present",
                          TRUE ~ str_trim(str_replace_all(name,"_"," "))))

# join and plot
left_join(mean,upper_ci, by = c("name","year_cat","cohort_name")) |>
  left_join(lower_ci, by = c("name","year_cat","cohort_name")) |>
  filter(name != "present") |>
  ggplot(aes(x = name, y = mean, fill = year_cat, label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci,ymin = lower_ci), width = 0.2, alpha = 0.5, position = position_dodge(width = 0.9), colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  facet_wrap(vars(cohort_name)) +
  barplottheme_minimal +
  labs(title = "Absence reasons one year prior and one year post - random samples",
       subtitle = "Random samples of 200 children, analysis pre & post a random date after 2021",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12), legend.position = "none", axis.title = eb, axis.text.y = eb, axis.text = element_text(size = 8), strip.background = eb) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 2))
```

The takeaway from these 4 charts is that a year on year increase in 'no reason' absences of about 1% is to be expected in many cases, as are small variations in the other coded reasons. Large changes beyond this are likely to reflect the behaviour of the cohort receiving the involvement and/or a result of the involvement itself.

# Services & Interventions

The rest of this report details the findings on effectiveness, broadly following the methodology given above.
We cover the following services, interventions & events. Aside from the child social care episodes (CIN/CPP/CLA), which are taken together, we cover them in order of decreasing volumes of activity (based on a count of children receiving each during 2024).

```{r}
#| label: table of involvement types by cohort name & count
#| layout: [45, -10, 45]

#cols_label(
#    contains("count") ~ "count",
#    contains("percent_of_pupils") ~ "% of pupils",
#    contains("percent_absent") ~ "% absent 2023/24"
#  ) |>

involvements |>
  filter(exam_year_inv_start == 2024) |>
  group_by(cohort_name) |>
  summarise(count = n_distinct(stud_id)) |>
  arrange(desc(count)) |>
  mutate(rownum = row_number()) |>
  filter(rownum <= max(rownum)/2) |>
  select(-rownum) |>
  gt() |>
  cols_label(
    contains("count") ~ "count",
    contains("cohort_name") ~ "Involvement"
  )

involvements |>
  filter(exam_year_inv_start == 2024) |>
  group_by(cohort_name) |>
  summarise(count = n_distinct(stud_id)) |>
  arrange(desc(count)) |>
  mutate(rownum = row_number()) |>
  filter(rownum > max(rownum)/2) |>
  select(-rownum) |>
  gt() |>
  cols_label(
    contains("count") ~ "count",
    contains("cohort_name") ~ "Involvement"
  )
```

## Family Intervention Service

Family Intervention Service (FIS), formerly known as MAST.
This service is targeted at families rather than children, but in our data model we mapped FIS activity to the records of about 3000 children per year:

```{r}
#| label: FIS volumes & age on start
#| out-width: 75%
#| fig-height: 3.5
#| fig-ncol: 2

involvements |>
  filter(cohort_name == "MAST", exam_year_inv_start >= 2019, exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Family Intervention Service - volumes",
       subtitle = "involvements starting by academic year",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

fis_inv_stud_id <- involvements |>
  filter(cohort_name == "MAST") |>
  select(stud_id) |>
  unique()

fis_age <- involvements |>
  filter(cohort_name == "MAST") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% fis_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id"
  ) |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(fis_age |> filter(age >= 0, age <= 20), aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 2, vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Family Intervention Service",
       x = "age on involvement open date") +
  coord_cartesian(clip = "off")
```

Immediately prior to FIS involvement we see a large drop in average attendance - perhaps this reflects some family crisis that prompts the service's involvement, and also affects school attendance?
The involvement is associated with an immediate increase in attendance, and a change in the direction of travel, with attendance steadily improving term-on-term once the team is involved. Although the overall net year on year change is still negative - attendance does not recover to prior levels. ```{r} #| label: plot FIS average attendance before and after inv_summary_long_start |> filter(cohort_name == "MAST") |> ggplot(aes(x = term_index, y = mean.percent_present ) ) + geom_point(colour = "steel blue", size = 2) + geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue")+ labs(title = "Average attendance before and after Family Intervention Service", subtitle = "Mean percentage of available sessions attended +- 95 CI", caption = "data from Capita One", x = "half term (relative to start date)")+ barplottheme_minimal + scale_y_continuous(labels = scales::percent) + scale_x_continuous(breaks = seq(-12,12)) + theme(axis.text.x = element_text(size = 8), legend.position = "none") ``` Exclusion rates remain high for children involved with FIS: ```{r} #| label: plot FIS exclusions #| fig-height: 3.5 inv_summary_long_start |> filter(cohort_name == "MAST") |> ggplot(aes(x = term_index, y = mean.percent_excluded) ) + geom_point(colour = "dark red") + geom_errorbar(aes(ymin = lower.ci.percent_excluded, ymax = upper.ci.percent_excluded), width = 0.2, colour = "dark red")+ labs(title = "Exclusion before and after intervention - Family Intervention Service", subtitle = "Average percentage of available sessions where the child was excluded +- 95 CI", x = "half term (relative to involvement start date)")+ barplottheme_minimal + scale_y_continuous(labels = scales::percent) + scale_x_continuous(breaks = seq(-12,12)) + theme(axis.text.x = element_text(size = 8), legend.position = "none") + geom_hline(yintercept = excl_2023_avg, linetype = "dashed", colour = "dark grey") + geom_text(aes(x = 4, label = paste("2023 
avg",scales::percent(excl_2023_avg, accuracy = 0.1L)), y = excl_2023_avg), size = 2.5, vjust = 1, colour = "dark grey") ``` The categorised reasons also show the immediate effect of FIS involvement, with an increase in the highest attendance category, and a drop in the lowest bracket: ```{r} #| label: categorised stack plot FIS # stacked bar of % category stack_data <- involvements |> filter(cohort_name == "MAST") |> select(stud_id,starts_with("attend_cat")) |> pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |> mutate(index = str_replace(index,"attend_cat_","")) |> mutate(index = str_replace(index,"start_","")) |> mutate(index = str_replace(index,"_"," ")) |> mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7","prior 6","prior 5","prior 4","prior 3","prior 2","prior 1","post 1","post 2","post 3","post 4","post 5","post 6","post 7","post 8","post 9","post 10","post 11","post 12"))) |> group_by(index, attend_cat) |> filter(attend_cat != "out of scope") |> filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |> tally() |> mutate(freq = n / sum(n)) |> filter(!is.na(index)) |> mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 100%","preschool","leaving age","covid","out of scope"))) ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) + geom_col(position = position_stack(reverse = TRUE))+ #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) + theme(axis.title.y = eb, legend.position = "right", axis.text.x = element_text(angle = 90, vjust = 0.5)) + scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) + scale_y_continuous(labels = scales::percent) + labs(title = "Categorised attendance before & after - Family Intervention Service", subtitle = "Percentage of pupils in each category", x = "half term index (relative to involvement 
start date)", caption = "data from Capita One", fill = "") + geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") + annotate(geom = "text", label = str_wrap("inv start", width = 10), y = 1.05, x = 13.5, colour = "gray40", size = 2.5) + coord_cartesian(clip = "off") ``` Since the significant change following FIS involvement happens at the half term level, we'll look at coded reasons at the half term level. Here we see a decrease across all coded absences, especially in illness rates. ```{r} #| label: plot before & after by reason code FIS #| fig-height: 4 before_after_ht_fun(cohort_name_selection = "MAST") |> ggplot(aes(x = name, y = mean, fill = prior_post, label = scales::percent(mean, accuracy = 0.01))) + geom_col(position = "dodge") + geom_errorbar(aes(ymax = upper_ci,ymin = lower_ci), width = 0.2, alpha = 0.5, position = position_dodge(width = 0.9), colour = "gray20") + geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) + scale_fill_met_d("Egypt") + scale_colour_met_d("Egypt") + scale_y_continuous(labels = scales::percent) + barplottheme_minimal + labs(title = "Absence reasons before and after Family Intervention Service", subtitle = "Average % of sessions missed one half term either side of involvement or episode start date", caption = "data from Capita One") + theme(plot.title = element_markdown(size = 12), legend.position = "none", axis.title = eb, axis.text.y = eb) + scale_x_discrete(labels = function(x) str_wrap(x, width = 2)) ``` ```{r} # get FIS categorised data # get involvements with available data fis_available <- involvements |> ungroup() |> filter(cohort_name == "MAST") |> #select(involvement_id, present_prior_1, present_prior_2, present_prior_3) |> rowwise() |> mutate(prior_data_available = ( if_else(!is.na(percent_present_prior_1),1,0) #+ # if_else(!is.na(percent_present_prior_2),1,0) + # if_else(!is.na(percent_present_prior_3),1,0) + # if_else(!is.na(percent_present_prior_4),1,0) + # 
if_else(!is.na(percent_present_prior_5),1,0) + # if_else(!is.na(percent_present_prior_6),1,0) ), post_data_available = ( if_else(!is.na(percent_present_start_post_1),1,0) # + # if_else(!is.na(percent_present_start_post_2),1,0) + # if_else(!is.na(percent_present_start_post_3),1,0) + # if_else(!is.na(percent_present_start_post_4),1,0) + # if_else(!is.na(percent_present_start_post_5),1,0) + # if_else(!is.na(percent_present_start_post_6),1,0) )) |> filter(prior_data_available >= 1, # 3, post_data_available >= 1) |> # 3) |> select(involvement_id) fis_categorised <- involvements |> filter(involvement_id %in% fis_available$involvement_id) |> select(involvement_id, stud_id, year = exam_year_inv_start, age_on_inv_start, open_date, prior = percent_present_prior_1, #percent_present_prior_year, post = percent_present_start_post_1 #percent_present_start_post_year ) |> mutate(diff = post - prior) |> mutate(diff_cat = case_when(diff == 0 ~ "no change", diff > 0 ~ "increase", TRUE ~ "decrease") ) |> filter(!is.na(diff)) |> left_join(stud_details_joined, by = "stud_id") |> ungroup() ``` Finally, we look at how the half-term to half-term shifts in attendance vary by some chosen characteristics: - Younger children see a greater net improvement than older children; those going into Y7 and Y11 see a net reduction - girls see a bigger improvement than boys - Deprivation makes a difference with children in more deprived wards seeing a bigger increase in attendance - but only to a point ```{r} #| label: plot FIS change by characteristics #| layout-ncol: 2 #| layout-nrow: 2 #| fig-height: 6 #| warning: false fis_categorised |> left_join(attend_stud_year_ncy |> select(stud_id,year, ncy), by = c("stud_id","year")) |> filter(ncy < 12, ncy > 0) |> group_by(ncy) |> summarise_before_after() |> ungroup() |> ggplot(aes(x = ncy, y = mean.diff, label = scales::percent(mean.diff))) + geom_col(fill = "steel blue") + geom_text(colour = "steel blue", size = 3, aes(vjust = if_else(mean.diff < 0,1,-1))) + 
geom_errorbar(aes(ymin = lower.ci.diff, ymax = upper.ci.diff), width = 0.2, colour = "gray40", alpha = 0.7) + scale_x_continuous(breaks = seq(0,11)) + labs(title = "Attendance pre and post Family Intervention Service, by NCY on involvement start", subtitle = "Mean % of sessions attended +/- 1 yr from involvement start, only those with attendance data available for at least 3 of 6 half terms either side", caption = "data from Capita One") + barplottheme_minimal + theme(axis.text.y = eb) # plot change by year of involvement start # taking this out as it almost certainly results from background changes in attendance rather than changes in the impact of the service # fis_categorised |> # filter(year >= 2010, year <= 2024) |> # group_by(year) |> # summarise_before_after() |> # ungroup() |> # ggplot(aes(x = year, # y = mean.diff, # label = scales::percent(mean.diff, accuracy = 0.1))) + # geom_col(fill = "steel blue") + # geom_text(colour = "steel blue", size = 3, aes(vjust = if_else(mean.diff < 0,1,-1))) + # geom_errorbar(aes(ymin = lower.ci.diff, ymax = upper.ci.diff), width = 0.2, colour = "gray40", alpha = 0.7) + # #scale_x_continuous(breaks = seq(1,11)) + # labs(title = "Mean change in annual attendance following FIS by year of involvement start", # subtitle = "Involvements with attendance data available for at least 3 of 6 half terms either side of the open date", # caption = "data from Capita One") + # barplottheme_minimal + # theme(axis.text.y = eb) # plot change by gender fis_categorised |> group_by(gender) |> summarise_before_after() |> ungroup() |> ggplot( aes(x = gender, y = mean.diff, label = scales::percent(mean.diff))) + geom_col(fill = "steel blue", width = 0.75) + geom_text(colour = "steel blue", size = 3, aes(vjust = if_else(mean.diff < 0,1,-1))) + geom_errorbar(aes(ymin = lower.ci.diff, ymax = upper.ci.diff), width = 0.2, colour = "gray40") + labs(title = "Mean change in annual attendance following Family Intervention Service, by gender", subtitle = 
"Involvements with attendance data available for at least 3 of 6 half terms either side of the open date", caption = "data from Capita One") + barplottheme_minimal + theme(axis.text.y = eb) # plot by imd quartile fis_categorised |> group_by(imd_quartile) |> summarise_before_after() |> ungroup() |> ggplot( aes(x = imd_quartile, y = mean.diff, label = scales::percent(mean.diff))) + geom_col(fill = "steel blue") + geom_text(colour = "steel blue", size = 3, aes(vjust = if_else(mean.diff < 0,1,-1))) + geom_errorbar(aes(ymin = lower.ci.diff, ymax = upper.ci.diff), width = 0.2, colour = "gray40") + labs(title = "Average change in attendance following Family Intervention Service, by IMD quartile", subtitle = "1 = most affluent wards; 4 = most deprived wards") + barplottheme_minimal + theme(axis.text.y = eb) ``` ## Educational Psychology This involvement marks the child's placement on the caseload of an Educational Psychologist. There are around a thousand of these in the city each year: ```{r} #| label: educational psychology advice volumes #| out-width: 75% involvements |> filter(cohort_name == "Ed. Psych.") |> group_by(exam_year_inv_start) |> summarise(child_count = n_distinct(stud_id)) |> ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) + geom_col(width = 0.8, fill = "steel blue") + geom_text(colour = "steel blue", vjust = -0.8, size = 3) + barplottheme_minimal + labs(title = "Educational Psychology - volumes", subtitle = "involvements starting by accademic year; 2025 is part year only", caption = "data from Capita One") + theme(axis.title.x = eb, axis.text.y = eb) + scale_x_continuous(breaks = seq(2010,2025)) + coord_cartesian(clip = "off") ``` ```{r} #| label: educational psychology advice age on inv start #| out-width: 75% ed_psych_inv_stud_id <- involvements |> filter(cohort_name == "Ed. Psych.") |> select(stud_id) |> unique() ed_psych_age <- involvements |> filter(cohort_name == "Ed. 
Psych.") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% ed_psych_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id") |>
  mutate(age = floor(as.numeric((open_date - dob) / 365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(ed_psych_age, aes(x = age, y = child_count)) +
  geom_col(fill = "steel blue") +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Educational Psychology",
       x = "age on involvement open date") +
  xlim(limits = c(0, 20))
```

The majority of these are for children who require SEN support, and those with speech, language and communication needs:

```{r}
#| label: table of ed psych count by sen level and primary specific need
#| layout: [45, -10, 45]

# by SEN level
involvements |>
  filter(exam_year_inv_start == 2024,
         cohort_name == "Ed. Psych.") |>
  group_by(
    #sen_need_category
    sen_level
  ) |>
  summarise(child_count = n_distinct(stud_id)) |>
  arrange(desc(child_count)) |>
  gt(rowname_col = "sen_level") |>
  tab_header(
    title = "Educational Psychology",
    subtitle = "counts of children with involvements in 2024, by SEN level"
  ) |>
  tab_options(
    table.align = "left",
    heading.title.font.size = 12,
    heading.subtitle.font.size = 10,
    heading.align = "left",
    column_labels.font.size = 14,
    stub.font.size = 12,
    column_labels.hidden = TRUE
  ) |>
  cols_align("left", 'sen_level')

# by primary specific need
involvements |>
  filter(exam_year_inv_start == 2024,
         cohort_name == "Ed. Psych.") |>
  group_by(primary_specific_need) |>
  summarise(child_count = n_distinct(stud_id)) |>
  arrange(desc(child_count)) |>
  gt(rowname_col = "primary_specific_need") |>
  tab_header(
    title = "Educational Psychology",
    subtitle = "counts of children with involvements in 2024, by primary specific need"
  ) |>
  tab_options(
    table.align = "left",
    heading.title.font.size = 12,
    heading.subtitle.font.size = 10,
    heading.align = "left",
    column_labels.font.size = 14,
    stub.font.size = 12,
    column_labels.hidden = TRUE
  ) |>
  cols_align("left", 'primary_specific_need')
```

The two-year run-up to the start of an Educational Psychology involvement shows quite a steep decline in average attendance, and the involvement start is associated with a turnaround in the direction of travel.

```{r}
#| label: plot educational psychology average attendance

inv_summary_long_start |>
  filter(cohort_name == "Ed. Psych.") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present),
                width = 0.2, colour = "steel blue") +
  labs(title = "Average attendance before and after intervention - Educational Psychology",
       subtitle = "Mean percentage of available sessions attended +- 95 CI",
       caption = "data from Capita One",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12, 12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

Exclusion rates show a rapid and sustained drop following Educational Psychology involvement:

```{r}
#| label: plot Ed Psych exclusions
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "Ed. Psych.") |>
  ggplot(aes(x = term_index, y = mean.percent_excluded)) +
  geom_point(colour = "dark red") +
  geom_errorbar(aes(ymin = lower.ci.percent_excluded, ymax = upper.ci.percent_excluded),
                width = 0.2, colour = "dark red") +
  labs(title = "Exclusion before and after intervention - Educational Psychology",
       subtitle = "Average percentage of available sessions where the child was excluded +- 95 CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12, 12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none") +
  geom_hline(yintercept = excl_2023_avg, linetype = "dashed", colour = "dark grey") +
  # accuracy is a plain numeric (0.1L is not a valid integer literal and triggers a warning)
  geom_text(aes(x = 4,
                label = paste("2023 avg", scales::percent(excl_2023_avg, accuracy = 0.1)),
                y = excl_2023_avg),
            size = 2.5, vjust = 1, colour = "dark grey")
```

The coded absence reasons in the year before and after Educational Psychology involvement don't reveal any movement beyond what we see in the random sampling, so they are not included here. The categorised attendance analysis also doesn't reveal any significant impact of Educational Psychologist involvement. Finally, we looked at a 'with & without' analysis, tracking children with various SEN levels and needs, and children in deprived areas. Across all of these groups, children with the Ed. Psych. involvement consistently have lower attendance than those without it, so clearly there are factors at work beyond what we have access to in the data.

```{r}
#| label: plot before & after by reason code for ed psych
#| fig-height: 3
#| eval: false

before_after_ht_fun(cohort_name_selection = "Ed. Psych.") |>
  ggplot(aes(x = name, y = mean, fill = prior_post, label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci, ymin = lower_ci), width = 0.2, alpha = 0.5,
                position = position_dodge(width = 0.9), colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons before and after Educational Psychology",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12), legend.position = "none",
        axis.title = eb, axis.text.y = eb)
```

```{r}
#| label: categorised stack plot ed psych
#| eval: false

stack_data <- involvements |>
  filter(cohort_name == "Ed. Psych.") |>
  select(stud_id, starts_with("attend_cat")) |>
  pivot_longer(-stud_id, names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index, "attend_cat_", "")) |>
  mutate(index = str_replace(index, "start_", "")) |>
  mutate(index = str_replace(index, "_", " ")) |>
  # half-term order: "prior 12" ... "prior 1", then "post 1" ... "post 12"
  mutate(index = factor(index, levels = c(paste("prior", 12:1), paste("post", 1:12)))) |>
  group_by(index, attend_cat) |>
  filter(attend_cat != "out of scope") |>
  filter(!attend_cat %in% c("out of scope", "preschool", "leaving age", "covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat,
                             levels = c("NA", "0 - 50%", "50 - 80%", "80 - 90%", "90 - 100%",
                                        "preschool", "leaving age", "covid", "out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse
= TRUE)) +
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb, legend.position = "right",
        axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Categorised attendance before & after - Educational Psychology",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       caption = "data from Capita One",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label = str_wrap("inv start", width = 10),
           y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

```{r}
#| label: ed_psych effectiveness by characteristic group
#| eval: false
#| fig-height: 8

ed_psych_stud_id <- involvements |>
  filter(cohort_name == "Ed. Psych.") |>
  select(stud_id) |>
  distinct() |>
  mutate(ed_psych = TRUE)

attend_ed_psych <- attend |>
  filter(ncy %in% c(0,1,2,3,4,5,6,7,8) & year >= 2021)

ed_psych_overall <- attend_ed_psych |>
  left_join(ed_psych_stud_id, by = "stud_id") |>
  mutate(ed_psych = if_else(is.na(ed_psych), FALSE, TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id", "ncy", "ed_psych")) |>
  group_by(ncy, ed_psych) |>
  summarise_avg() |>
  mutate(category = "overall")

ed_psych_deprived <- attend_ed_psych |>
  left_join(ed_psych_stud_id, by = "stud_id") |>
  filter(ward_imd_score >= 40) |>
  mutate(ed_psych = if_else(is.na(ed_psych), FALSE, TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id", "ncy", "ed_psych")) |>
  group_by(ncy, ed_psych) |>
  summarise_avg() |>
  mutate(category = "most deprived wards")

ed_psych_sen_support <- attend_ed_psych |>
  left_join(ed_psych_stud_id, by = "stud_id") |>
  filter(sen_level == "SEN Support") |>
  mutate(ed_psych = if_else(is.na(ed_psych), FALSE, TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id", "ncy", "ed_psych")) |>
  group_by(ncy, ed_psych) |>
  summarise_avg() |>
  mutate(category = "SEN Support")

ed_psych_ehcp <- attend_ed_psych |>
  left_join(ed_psych_stud_id, by = "stud_id") |>
  filter(sen_level == "EHCP") |>
  mutate(ed_psych = if_else(is.na(ed_psych), FALSE, TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id", "ncy", "ed_psych")) |>
  group_by(ncy, ed_psych) |>
  summarise_avg() |>
  mutate(category = "EHCP") |>
  filter(ncy > 0) # this row contains a huge error margin

ed_psych_slcn <- attend_ed_psych |>
  left_join(ed_psych_stud_id, by = "stud_id") |>
  filter(primary_specific_need == "Speech, Language And Communication Needs") |>
  mutate(ed_psych = if_else(is.na(ed_psych), FALSE, TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id", "ncy", "ed_psych")) |>
  group_by(ncy, ed_psych) |>
  summarise_avg() |>
  mutate(category = "Speech, Language And Communication Needs")

ed_psych_asd <- attend_ed_psych |>
  left_join(ed_psych_stud_id, by = "stud_id") |>
  filter(primary_specific_need == "Autism") |>
  mutate(ed_psych = if_else(is.na(ed_psych), FALSE, TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id", "ncy", "ed_psych")) |>
  group_by(ncy, ed_psych) |>
  summarise_avg() |>
  mutate(category = "Autism")

rbind(
  ed_psych_overall,
  ed_psych_sen_support,
  ed_psych_deprived,
  ed_psych_ehcp,
  ed_psych_slcn,
  ed_psych_asd
) |>
  mutate(category = factor(category,
                           levels = c("overall", "most deprived wards", "SEN Support", "EHCP",
                                      "Speech, Language And Communication Needs", "Autism"))) |>
  ggplot(aes(x = ncy, y = mean.percent_present, colour = ed_psych, group = ed_psych)) +
  geom_point() +
  geom_line(alpha = 0.7) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2) +
  geom_text(aes(label = scales::percent(round(mean.percent_present, 3))),
            vjust = 2, colour = "white", size = 4, position = position_dodge(0.9)) +
  geom_vline(aes(xintercept = 6.5), linetype = "dotted", colour = "gray") +
  facet_wrap(vars(category)) +
  labs(title = "Attendance through years 0 to 6 - pupils without and with Educational Psychology involvement",
       subtitle = "Average percentage of available sessions attended since 2021; +- 95 CI",
       caption = "data from Capita One") +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(labels = seq(0, 6), breaks = seq(0, 6)) +
  barplottheme_minimal +
  theme(axis.title.x = eb,
        #axis.text.y = eb,
        legend.position = "none",
        strip.background = eb,
        strip.text = element_text(size = 8),
        plot.title = element_markdown(size = 12)) +
  MetBrewer::scale_fill_met_d("Egypt")

remove(attend_ed_psych)
```

## Attendance advice

844 attendance advice involvements were started in 2024:

```{r}
#| label: plot attendance advice volumes
#| fig-height: 3.5

involvements |>
  filter(cohort_name == "Attendance Advice") |>
  group_by(year = exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  filter(year > 2020, year < 2025) |>
  ggplot(aes(x = year, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -1, size = 3.5) +
  barplottheme_minimal +
  labs(title = "Attendance Advice volumes",
       subtitle = "involvements starting by academic year",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010, 2024)) +
  coord_cartesian(clip = "off")

aa_inv_stud_id <- involvements |>
  filter(cohort_name == "Attendance Advice") |>
  select(stud_id) |>
  unique()
```

The age profile shows children of all ages receiving this involvement, but with a peak at ages 13 to 15.
```{r}
#| label: plot attendance advice age on inv start
#| fig-height: 3.5

aa_age <- involvements |>
  filter(cohort_name == "Attendance Advice") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% aa_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id") |>
  mutate(age = floor(as.numeric((open_date - dob) / 365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(aa_age |> filter(age >= 0, age <= 20),
       aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 2, vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Attendance advice",
       x = "age on involvement open date") +
  xlim(limits = c(3, 18)) +
  coord_cartesian(clip = "off")
```

Calculating the average (mean) attendance in the terms prior to and following the opening of an attendance advice involvement, we see the pattern below. Average attendance declines steeply towards the point of intervention.
After the involvement starts we see a turnaround, though attendance does not recover to the levels seen two years prior (given the overall four-year timescale here, this can perhaps be attributed to the tendency of attendance to reduce with age).

```{r}
#| label: plot attendance advice average attendance

inv_summary_long_start |>
  filter(cohort_name == "Attendance Advice") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present),
                width = 0.2, colour = "steel blue") +
  labs(title = "Average attendance before and after intervention - Attendance Advice",
       subtitle = "Mean percentage of available sessions attended +- 95 CI",
       caption = "data from Capita One",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12, 12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

The categorised attendance analysis shows more detail than the overall average: there is steady growth in both those entirely missing from the data ('NA') and the highest attendance bracket, while the severe absence category steadily decreases.
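
The attendance brackets in this categorised view (`0 - 50%`, `50 - 80%`, `80 - 90%`, `90 - 100%`, plus `NA`) arrive precomputed in the `attend_cat_*` columns of `involvements`. As a rough illustration only, a half-term attendance rate might be banded as sketched here. The function name `band_attendance()` and the treatment of boundary values (lower bound inclusive) are assumptions, not the method used to build the data.

```r
# Hypothetical banding helper. Bracket labels are taken from the report;
# the boundary handling (lower bound inclusive) is an assumption.
band_attendance <- function(percent_present) {
  bands <- cut(
    percent_present,
    breaks = c(0, 0.5, 0.8, 0.9, 1),
    labels = c("0 - 50%", "50 - 80%", "80 - 90%", "90 - 100%"),
    right = FALSE,         # intervals are [lower, upper)
    include.lowest = TRUE  # with right = FALSE this closes the top interval at 1
  )
  bands <- as.character(bands)
  # Half terms with no attendance data get the literal label "NA", as in the plots
  bands[is.na(percent_present)] <- "NA"
  bands
}

banded <- band_attendance(c(0.45, 0.5, 0.85, 0.95, 1, NA))
print(banded)
```

Keeping the missing group as its own labelled bracket, rather than dropping it, is what lets the stacked chart show growth in the 'NA' group alongside the attendance brackets.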
Here those leaving school or at preschool are removed:

```{r}
#| label: categorised stack plot attendance advice

# stacked bar of % category
stack_data <- involvements |>
  filter(cohort_name == "Attendance Advice") |>
  select(stud_id, starts_with("attend_cat")) |>
  pivot_longer(-stud_id, names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index, "attend_cat_", "")) |>
  mutate(index = str_replace(index, "start_", "")) |>
  mutate(index = str_replace(index, "_", " ")) |>
  # half-term order: "prior 12" ... "prior 1", then "post 1" ... "post 12"
  mutate(index = factor(index, levels = c(paste("prior", 12:1), paste("post", 1:12)))) |>
  group_by(index, attend_cat) |>
  filter(attend_cat != "out of scope") |>
  filter(!attend_cat %in% c("out of scope", "preschool", "leaving age", "covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat,
                             levels = c("NA", "0 - 50%", "50 - 80%", "80 - 90%", "90 - 100%",
                                        "preschool", "leaving age", "covid", "out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb, legend.position = "right",
        axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Categorised attendance before & after - Attendance Advice",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       caption = "data from Capita One",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label = str_wrap("inv start", width = 10),
           y = 1.05, x =
13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

```{r}
#| label: prep attendance advice characteristic data

# get involvements that have available data
aa_available <- involvements |>
  ungroup() |>
  filter(cohort_name == "Attendance Advice") |>
  #select(involvement_id, present_prior_1, present_prior_2, present_prior_3) |>
  rowwise() |>
  mutate(prior_data_available = (
    if_else(!is.na(percent_present_prior_1),1,0) #+
    #if_else(!is.na(percent_present_prior_2),1,0) +
    #if_else(!is.na(percent_present_prior_3),1,0) +
    #if_else(!is.na(percent_present_prior_4),1,0) +
    #if_else(!is.na(percent_present_prior_5),1,0) +
    #if_else(!is.na(percent_present_prior_6),1,0)
  ),
  post_data_available = (
    if_else(!is.na(percent_present_start_post_1),1,0) #+
    #if_else(!is.na(percent_present_start_post_2),1,0) +
    #if_else(!is.na(percent_present_start_post_3),1,0) +
    #if_else(!is.na(percent_present_start_post_4),1,0) +
    #if_else(!is.na(percent_present_start_post_5),1,0) +
    #if_else(!is.na(percent_present_start_post_6),1,0)
  )) |>
  filter(prior_data_available >= 1, #3,
         post_data_available >= 1) |> #3) |>
  select(involvement_id)

aa_categorised <- involvements |>
  filter(involvement_id %in% aa_available$involvement_id) |>
  select(involvement_id,
         stud_id,
         year = exam_year_inv_start,
         age_on_inv_start,
         open_date,
         prior = percent_present_prior_1, #percent_present_prior_year,
         post = percent_present_start_post_1 #percent_present_start_post_year
  ) |>
  mutate(diff = post - prior) |>
  mutate(diff_cat = case_when(diff == 0 ~ "no change",
                              diff > 0 ~ "increase",
                              TRUE ~ "decrease")) |>
  filter(!is.na(diff)) |>
  left_join(stud_details_joined, by = "stud_id")
```

Another approach is to compare the relative effectiveness of attendance advice on groups of children with different characteristics. Attendance advice is associated with a reduction in attendance year on year (and this remains true across all groups), but if we look half term to half term there are groups who see a net improvement:

1. Attendance advice makes more of a difference in younger children.

```{r}
aa_categorised |>
  left_join(attend_stud_year_ncy |> select(stud_id, ncy)) |>
  group_by(ncy) |>
  summarise_before_after() |>
  ggplot(aes(x = ncy, y = mean.diff, label = scales::percent(mean.diff, accuracy = 0.1))) +
  geom_col(fill = "steel blue") +
  geom_text(colour = "steel blue", aes(vjust = if_else(mean.diff < 0, 1, -1))) +
  geom_errorbar(aes(ymin = lower.ci.diff, ymax = upper.ci.diff),
                width = 0.2, colour = "gray40", alpha = 0.7) +
  scale_x_continuous(breaks = seq(1, 11)) +
  labs(title = "Attendance pre and post Attendance Advice, by NCY on involvement start",
       subtitle = "Mean % of sessions attended +/- 1 yr from involvement start, only those with attendance data available for at least 3 of 6 half terms either side",
       caption = "data from Capita One") +
  barplottheme_minimal +
  theme(axis.text.y = eb)
```

2. There is a big difference by gender, with girls seeing a net increase at the half-term level and boys seeing an overall _decrease_.

```{r}
aa_categorised |>
  group_by(gender) |>
  summarise_before_after() |>
  ggplot(aes(x = gender, y = mean.diff, label = scales::percent(mean.diff, accuracy = 0.1))) +
  geom_col(fill = "steel blue", width = 0.75) +
  geom_text(colour = "steel blue", aes(vjust = if_else(mean.diff < 0, 1, -1))) +
  geom_errorbar(aes(ymin = lower.ci.diff, ymax = upper.ci.diff), width = 0.2, colour = "gray40") +
  labs(title = "Mean change in annual attendance following Attendance Advice, by gender",
       subtitle = "Involvements with attendance data available for at least 3 of 6 half terms either side of the open date",
       caption = "data from Capita One") +
  barplottheme_minimal +
  theme(axis.text.y = eb)
```

3. There is also a trend by deprivation: children living in poorer areas of the city see a net _increase_, whereas children living in more affluent wards see a _decrease_ on average.
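
The group comparisons in this list each pass a grouped data frame to `summarise_before_after()`, which is defined outside this section. As a rough illustration only, a helper of that shape might compute the mean change and a 95% interval as sketched here. The name `summarise_before_after_sketch()`, the toy data and the choice of a t-based interval are assumptions, not the report's actual implementation; only the output column names are taken from the plotting code.

```r
library(dplyr)

# Hypothetical stand-in for summarise_before_after(): mean of `diff` per group
# with a t-based 95% confidence interval (the interval method is an assumption).
summarise_before_after_sketch <- function(grouped_df) {
  grouped_df |>
    summarise(
      n = sum(!is.na(diff)),
      mean.diff = mean(diff, na.rm = TRUE),
      se.diff = sd(diff, na.rm = TRUE) / sqrt(n),
      lower.ci.diff = mean.diff - qt(0.975, df = n - 1) * se.diff,
      upper.ci.diff = mean.diff + qt(0.975, df = n - 1) * se.diff,
      .groups = "drop"
    )
}

# Toy data only: one row per involvement, attendance rate either side of the start
toy <- tibble(
  gender = c("F", "F", "F", "M", "M", "M"),
  prior  = c(0.80, 0.85, 0.90, 0.88, 0.92, 0.75),
  post   = c(0.84, 0.86, 0.93, 0.85, 0.90, 0.74)
) |>
  mutate(diff = post - prior)

res <- toy |>
  group_by(gender) |>
  summarise_before_after_sketch()
print(res)
```

With small groups the interval is wide, which is why the error bars matter when reading the bar charts: a positive mean change whose interval spans zero is not clear evidence of improvement.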
```{r}
aa_categorised |>
  group_by(imd_quartile) |>
  summarise_before_after() |>
  ggplot(aes(x = imd_quartile, y = mean.diff, label = scales::percent(mean.diff, accuracy = 0.1))) +
  geom_col(fill = "steel blue") +
  geom_text(colour = "steel blue", aes(vjust = if_else(mean.diff < 0, 1, -1))) +
  geom_errorbar(aes(ymin = lower.ci.diff, ymax = upper.ci.diff), width = 0.2, colour = "gray40") +
  labs(title = "Average change in attendance following attendance advice, by IMD quartile",
       subtitle = "1 = most affluent wards; 4 = most deprived wards") +
  barplottheme_minimal +
  theme(axis.text.y = eb)
```

```{r}
#| label: plot attendance advice by year of involvement start
#| eval: false
#| echo: false

# Not rendered for attendance advice as there's no difference by year
aa_categorised |>
  #left_join(attend_stud_year_ncy |> select(stud_id,ncy)) |>
  group_by(year) |>
  summarise_before_after() |>
  ggplot(aes(x = year, y = mean.diff, label = scales::percent(mean.diff))) +
  geom_col(fill = "steel blue") +
  geom_text(colour = "steel blue", aes(vjust = if_else(mean.diff < 0, 1, -1))) +
  geom_errorbar(aes(ymin = lower.ci.diff, ymax = upper.ci.diff),
                width = 0.2, colour = "gray40", alpha = 0.7) +
  #scale_x_continuous(breaks = seq(1,11)) +
  labs(title = "Mean change in annual attendance following Attendance Advice by year",
       subtitle = "Involvements with attendance data available for at least 3 of 6 half terms either side of the open date",
       caption = "data from Capita One")
```

```{r}
#| label: plot attendance advice by sen specific need
#| eval: false
#| echo: false

# Not rendered for attendance advice as there's no significant difference
aa_categorised |>
  filter(!is.na(primary_specific_need)) |>
  group_by(primary_specific_need) |>
  summarise_before_after() |>
  ggplot(aes(x = reorder(primary_specific_need, mean.diff), y = mean.diff,
             label = scales::percent(mean.diff))) +
  geom_col(fill = "steel blue") +
  geom_errorbar(aes(ymin = lower.ci.diff, ymax = upper.ci.diff), width = 0.2, colour =
"gray60") + geom_text() + coord_flip() + labs(title = "Average change in attendance following attendance advice, by SEN specific need", subtitle = "Mean change in % of sessions attended +- 95 CI", y = "mean difference in annual attendance", caption = "data from Capita One") + barplottheme_minimal + theme(axis.title.x = eb) ``` ```{r} #| label: plot attendance advice by ethnicity description #| eval: false #| echo: false # This not rendered for attendance allowance as there's no significant difference aa_categorised |> filter(!is.na(ethnicity_description)) |> group_by(ethnicity_description) |> summarise_before_after() |> ggplot( aes(x = reorder(ethnicity_description,mean.diff), y = mean.diff, label = scales::percent(mean.diff))) + geom_col(fill = "steel blue") + geom_errorbar(aes(ymin = lower.ci.diff, ymax = upper.ci.diff), width = 0.2, colour = "gray60") + geom_text() + coord_flip() + labs(title = "Average change in attendance following attendance advice, by SEN level", subtitle = "Mean change in % of sessions attended +- 95 CI", y = "mean difference in annual attendance", caption = "data from Capita One") + barplottheme_minimal + theme(axis.title.x = eb) + labs(title = "Average change in attendance following attendance advice, by ethnicity description", subtitle = "Mean change in % of sessions attended +- 95 CI", y = "mean difference in annual attendance", caption = "data from Capita One") + barplottheme_minimal + theme(axis.title.x = eb) ``` It's worth also looking at the school level - since attendance advice is essentially a notification from the authority to the school, and the school's response to this may vary. 
The confidence intervals (shown as error bars here) are very wide for this data, but there are big differences between schools that may be indicative of different responses to the attendance advice intervention:

```{r}
#| label: plot attendance advice by school
#| fig-height: 6

# plot difference by school
aa_categorised |>
  left_join(attend_stud_year_school |>
              ungroup() |>
              select(stud_id,
                     school = school_name,
                     year),
            by = c("stud_id", "year")) |>
  filter(!is.na(school)) |>
  left_join(school_details_short |> rename(school = school_name)) |>
  filter(school_ed_phase == "Secondary",
         school_sheffield_flag == 1) |>
  group_by(school) |>
  summarise_before_after() |>
  ggplot(aes(x = reorder(school, mean.diff), y = mean.diff,
             label = scales::percent(mean.diff, accuracy = 0.1))) +
  geom_col(fill = "steel blue") +
  geom_errorbar(aes(ymin = lower.ci.diff, ymax = upper.ci.diff), width = 0.2, colour = "gray60") +
  geom_text() +
  coord_flip() +
  # a stale duplicate labs() call titled "by SEN level" was overridden by this one and is dropped
  labs(title = "Average change in attendance following attendance advice, by school",
       subtitle = "Mean change in % of sessions attended +- 95 CI",
       y = "mean difference in annual attendance",
       caption = "data from Capita One") +
  barplottheme_minimal +
  theme(axis.title.x = eb)
```

Finally for Attendance Advice, the coded reasons a full year either side of involvement show only increases across various absence reasons, but there are small improvements at the half-term level, mostly in illness and 'no reason' absences:

```{r}
#| label: plot before & after by reason code for attendance advice
#| fig-height: 3.5

before_after_ht_fun(cohort_name_selection = "Attendance Advice") |>
  ggplot(aes(x = name, y = mean, fill = prior_post, label = scales::percent(mean,
accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci, ymin = lower_ci), width = 0.2, alpha = 0.5,
                position = position_dodge(width = 0.9), colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons before and after Attendance Advice",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12), legend.position = "none",
        axis.title = eb, axis.text.y = eb)
```

## Reduced timetable

Reduced timetable volumes and age profile:

```{r}
#| label: reduced timetable volumes & age on involvement start
#| out-width: 75%
#| fig-ncol: 2

involvements |>
  filter(cohort_name == "Reduced timetable") |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Reduced Timetable - volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010, 2025)) +
  coord_cartesian(clip = "off")

reduced_timetable_stud_id <- involvements |>
  filter(cohort_name == "Reduced timetable") |>
  select(stud_id) |>
  unique()

reduced_timetable_age <- involvements |>
  filter(cohort_name == "Reduced timetable") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% reduced_timetable_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id") |>
  mutate(age = floor(as.numeric((open_date - dob) / 365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))
ggplot(reduced_timetable_age |> filter(age >= 0, age <= 16),
       aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steel blue") +
  geom_text(colour = "steelblue", size = 2, vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Reduced timetable",
       x = "age on involvement open date") +
  scale_x_continuous(breaks = seq(4, 16, by = 1))
```

```{r}
#| label: plot Reduced timetable average attendance before and after

inv_summary_long_start |>
  filter(cohort_name == "Reduced timetable") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present),
                width = 0.2, colour = "steel blue") +
  labs(title = "Average attendance before and after Reduced timetable",
       subtitle = "Mean percentage of available sessions attended +- 95 CI",
       caption = "data from Capita One",
       x = "half term (relative to start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12, 12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

Reduced timetables are associated with a reduction in all absence codes:

```{r}
#| label: plot before & after by reason code Reduced timetable

before_after_ht_fun(cohort_name_selection = "Reduced timetable") |>
  ggplot(aes(x = name, y = mean, fill = prior_post, label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci, ymin = lower_ci), width = 0.2, alpha = 0.5,
                position = position_dodge(width = 0.9), colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons before and after reduced timetable",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12), legend.position = "none",
        axis.title = eb, axis.text.y = eb)
```

The categorised attendance data reveals large increases in the 0-50% attendance bracket, presumably as a result of the reduced timetable arrangement. There are no lasting improvements to the higher attendance brackets:

```{r}
#| label: categorised stack plot reduced timetable
#| eval: false

# stacked bar of % category
stack_data <- involvements |>
  filter(cohort_name == "Reduced timetable") |>
  select(stud_id, starts_with("attend_cat")) |>
  pivot_longer(-stud_id, names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index, "attend_cat_", "")) |>
  mutate(index = str_replace(index, "start_", "")) |>
  mutate(index = str_replace(index, "_", " ")) |>
  # half-term order: "prior 12" ... "prior 1", then "post 1" ... "post 12"
  mutate(index = factor(index, levels = c(paste("prior", 12:1), paste("post", 1:12)))) |>
  group_by(index, attend_cat) |>
  filter(attend_cat != "out of scope") |>
  #filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat,
                             levels = c("NA", "0 - 50%", "50 - 80%", "80 - 90%", "90 - 100%",
                                        "preschool", "leaving age", "covid", "out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb, legend.position = "right",
        axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels =
scales::percent) +
  labs(title = "Categorised attendance before & after - Reduced timetable",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       caption = "data from Capita One",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label = str_wrap("inv start", width = 10),
           y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

## Current EHCP

The 'Current EHCP' involvement simply tracks children with an Education, Health & Care Plan (EHCP). Although not an intervention in and of itself, it is a plan to meet the child's needs, so it will be associated with other changes. There are around 800 of these starting each year, though changes in demand, and in the council's response to that demand, may affect volumes. 2023 seems to have been a peak year.

```{r}
#| label: current ehcp volumes
#| out-width: 75%
#| fig-align: "center"

involvements |>
  filter(cohort_name == "Current EHCP") |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Current EHCP - volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010, 2025)) +
  coord_cartesian(clip = "off")
```

The current EHCP start date sees a sustained increase in average attendance levels:

```{r}
#| label: plot current EHCP average attendance before and after

inv_summary_long_start |>
  filter(cohort_name == "Current EHCP") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax =
upper.ci.percent_present), width = 0.2, colour = "steel blue")+ labs(title = "Average attendance before and after Current EHCP involvement", subtitle = "Mean percentage of available sessions attended +- 95 CI", caption = "data from Capita One", x = "half term (relative to start date)")+ barplottheme_minimal + scale_y_continuous(labels = scales::percent) + scale_x_continuous(breaks = seq(-12,12)) + theme(axis.text.x = element_text(size = 8), legend.position = "none") ``` The EHCP start date is associated with a reduction in almost all coded absence reasons: ```{r} #| label: plot before & after by reason code Current EHCP before_after_year_fun(cohort_name_selection = "Current EHCP") |> ggplot(aes(x = name, y = mean, fill = prior_post, label = scales::percent(mean, accuracy = 0.01))) + geom_col(position = "dodge") + geom_errorbar(aes(ymax = upper_ci,ymin = lower_ci), width = 0.2, alpha = 0.5, position = position_dodge(width = 0.9), colour = "gray20") + geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) + scale_fill_met_d("Egypt") + scale_colour_met_d("Egypt") + scale_y_continuous(labels = scales::percent) + barplottheme_minimal + labs(title = "Absence reasons before and after Current EHCP involvement", subtitle = "Average % of sessions missed one year either side of involvement or episode start date", caption = "data from Capita One") + theme(plot.title = element_markdown(size = 12), legend.position = "none", axis.title = eb, axis.text.y = eb) ``` ```{r} # get involvements that have available data ehcp_available <- involvements |> ungroup() |> filter(cohort_name == "Current EHCP") |> #select(involvement_id, present_prior_1, present_prior_2, present_prior_3) |> rowwise() |> mutate(prior_data_available = ( if_else(!is.na(percent_present_prior_1),1,0) + if_else(!is.na(percent_present_prior_2),1,0) + if_else(!is.na(percent_present_prior_3),1,0) + if_else(!is.na(percent_present_prior_4),1,0) + if_else(!is.na(percent_present_prior_5),1,0) + 
if_else(!is.na(percent_present_prior_6),1,0) ), post_data_available = ( if_else(!is.na(percent_present_start_post_1),1,0) + if_else(!is.na(percent_present_start_post_2),1,0) + if_else(!is.na(percent_present_start_post_3),1,0) + if_else(!is.na(percent_present_start_post_4),1,0) + if_else(!is.na(percent_present_start_post_5),1,0) + if_else(!is.na(percent_present_start_post_6),1,0) )) |> filter(prior_data_available >= 3, post_data_available >= 3) |> select(involvement_id) ehcp_categorised <- involvements |> #filter(involvement_id %in% ehcp_available$involvement_id) |> filter(cohort_name == "Current EHCP") |> select(involvement_id, stud_id, year = exam_year_inv_start, age_on_inv_start, open_date, prior = percent_present_prior_year, post = percent_present_start_post_year ) |> mutate(diff = post - prior) |> mutate(diff_cat = case_when(diff == 0 ~ "no change", diff > 0 ~ "increase", TRUE ~ "decrease") ) |> filter(!is.na(diff)) |> left_join(stud_details_joined, by = "stud_id") ``` ```{r} #| label: plot current ehcp change by age band #| fig-height: 3.5 #| eval: false ehcp_categorised |> left_join(attend_stud_year_ncy |> select(stud_id,ncy)) |> group_by(ncy) |> summarise_before_after() |> ggplot(aes(x = ncy, y = mean.diff, label = scales::percent(mean.diff))) + geom_col(fill = "steel blue") + geom_text(colour = "steel blue", aes(vjust = if_else(mean.diff < 0,1,-1))) + geom_errorbar(aes(ymin = lower.ci.diff, ymax = upper.ci.diff), width = 0.2, colour = "gray40", alpha = 0.7) + scale_x_continuous(breaks = seq(1,11)) + labs(title = "Attendance pre and post Attendance Advice, by NCY on involvement start", subtitle = "Mean % of sessions attended +/- 1 yr from involvement start, only those with attendance data available for at least 3 of 6 half terms either side", caption = "data from Capita One") ``` ```{r} #| label: plot current ehcp by year of involvement start #| eval: false ehcp_categorised |> #left_join(attend_stud_year_ncy |> select(stud_id,ncy)) |> group_by(year) |> 
  summarise_before_after() |>
  ggplot(aes(x = year, y = mean.diff, label = scales::percent(mean.diff))) +
  geom_col(fill = "steel blue") +
  geom_text(colour = "steel blue", aes(vjust = if_else(mean.diff < 0,1,-1))) +
  geom_errorbar(aes(ymin = lower.ci.diff, ymax = upper.ci.diff), width = 0.2, colour = "gray40", alpha = 0.7) +
  #scale_x_continuous(breaks = seq(1,11)) +
  labs(title = "Mean change in annual attendance following Current EHCP involvement, by year",
       subtitle = "Involvements with attendance data available for at least 3 of 6 half terms either side of the open date",
       caption = "data from Capita One")
```

```{r}
#| label: plot current ehcp change by primary specific need
#| eval: false

ehcp_categorised |>
  group_by(primary_specific_need) |>
  summarise_before_after() |>
  ggplot(aes(x = primary_specific_need, y = mean.diff, label = scales::percent(mean.diff))) +
  geom_col(fill = "steel blue", width = 0.75) +
  geom_text(colour = "steel blue", aes(vjust = if_else(mean.diff < 0,1,-1))) +
  geom_errorbar(aes(ymin = lower.ci.diff, ymax = upper.ci.diff), width = 0.2, colour = "gray40") +
  labs(title = "Mean change in annual attendance following Current EHCP involvement, by primary specific need",
       subtitle = "Involvements with attendance data available for at least 3 of 6 half terms either side of the open date",
       caption = "data from Capita One") +
  barplottheme_minimal +
  coord_flip()
```

```{r}
#| eval: false

ehcp_categorised |>
  group_by(imd_quartile) |>
  summarise_before_after() |>
  ggplot(aes(x = imd_quartile, y = mean.diff, label = scales::percent(mean.diff))) +
  geom_col(fill = "steel blue") +
  geom_text(colour = "steel blue", aes(vjust = if_else(mean.diff < 0,1,-1))) +
  geom_errorbar(aes(ymin = lower.ci.diff, ymax = upper.ci.diff), width = 0.2, colour = "gray40") +
  labs(title = "Average change in attendance following current EHCP involvement, by IMD quartile",
       subtitle = "1 = most affluent wards; 4 = most deprived wards") +
  barplottheme_minimal
```

The categorised attendance plot shows
a sustained increase in all higher attendance brackets, and a reduction in the severely absent bracket:

```{r}
#| label: categorised stack plot current EHCP

# stacked bar of % category
stack_data <- involvements |>
  filter(cohort_name == "Current EHCP") |>
  select(stud_id,starts_with("attend_cat")) |>
  pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index,"attend_cat_","")) |>
  mutate(index = str_replace(index,"start_","")) |>
  mutate(index = str_replace(index,"_"," ")) |>
  mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7","prior 6","prior 5","prior 4","prior 3","prior 2","prior 1","post 1","post 2","post 3","post 4","post 5","post 6","post 7","post 8","post 9","post 10","post 11","post 12"))) |>
  group_by(index, attend_cat) |>
  filter(attend_cat != "out of scope") |>
  filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 100%","preschool","leaving age","covid","out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE))+
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb,
        legend.position = "right",
        axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Categorised attendance before & after - current EHCP",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       caption = "data from Capita One",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label =
str_wrap("inv start", width = 10), y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

## Autism Team

We start around 600 of these involvements per year:

```{r}
#| label: autism team volumes
#| out-width: 75%
#| fig-align: "center"

involvements |>
  filter(cohort_name == "Autism team") |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Autism team - volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")
```

The 'before & after' plot shows only a small change in attendance levels. But since the team become involved around age 5, most children do not have much of a 'before' period to consider.

```{r}
#| label: plot autism team average attendance before and after
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "Autism team") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue")+
  labs(title = "Average attendance before and after Autism Team involvement",
       subtitle = "Mean percentage of available sessions attended +- 95 CI",
       caption = "data from Capita One",
       x = "half term (relative to start date)")+
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

In the coded year-on-year view, the Autism team start date is associated with an increase in almost all types of absence.
```{r} #| label: plot before & after by reason code autism team before_after_year_fun(cohort_name_selection = "Autism team") |> ggplot(aes(x = name, y = mean, fill = prior_post, label = scales::percent(mean, accuracy = 0.01))) + geom_col(position = "dodge") + geom_errorbar(aes(ymax = upper_ci,ymin = lower_ci), width = 0.2, alpha = 0.5, position = position_dodge(width = 0.9), colour = "gray20") + geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) + scale_fill_met_d("Egypt") + scale_colour_met_d("Egypt") + scale_y_continuous(labels = scales::percent) + barplottheme_minimal + labs(title = "Absence reasons before and after Autism Team involvement", subtitle = "Average % of sessions missed one year either side of involvement or episode start date", caption = "data from Capita One") + theme(plot.title = element_markdown(size = 12), legend.position = "none", axis.title = eb, axis.text.y = eb) ``` The assumption here has to be that these year on year increases reflect the steeper line seen in the 'with & without' analysis below, and are explained by the same greater severity of need. To create a "with vs without the team" analysis, we take all children in the attendance data, with a primary specific need of Autism at some point in the attendance data (so we're taking a "lifetime diagnosis" approach). This cohort is then categorised into those who had involvement from the team and those who did not. We see a consistently lower attendance for children involved with the team, as well as a consistently steeper drop in attendance. Once again, we have to assume that this reflects differences in the severity of need - with the team becoming involved in the more severe cases - and that there are factors at work here that are not recorded in the data. 
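The cohort split described above can be sketched on made-up data as follows. This is only an illustration under stated assumptions, not the report's pipeline: `toy_attend`, `toy_involvements` and their values are hypothetical, and only the column names mirror those used in this report. The report's own chunk additionally matches on involvement dates and groups by curriculum year.

```r
library(dplyr)

# Hypothetical toy data: one row per child per half term
toy_attend <- tibble(
  stud_id = c(1, 1, 2, 2, 3, 3, 4, 4),
  percent_present = c(0.95, 0.90, 0.80, 0.70, 0.92, 0.96, 0.60, NA)
)

# Hypothetical involvement records: children 2 and 4 were seen by the team
toy_involvements <- tibble(stud_id = c(2, 4, 4))

toy_comparison <- toy_attend |>
  # any match on stud_id counts as team involvement
  mutate(inv_flag = as.integer(stud_id %in% toy_involvements$stud_id)) |>
  group_by(inv_flag) |>
  summarise(
    mean_present = mean(percent_present, na.rm = TRUE),
    sd_present = sd(percent_present, na.rm = TRUE),
    # count only non-missing values so n matches the mean and sd
    n_present = sum(!is.na(percent_present)),
    .groups = "drop"
  ) |>
  mutate(
    se_present = sd_present / sqrt(n_present),
    t_crit = qt(0.975, df = n_present - 1),
    lower_ci = mean_present - t_crit * se_present,
    upper_ci = mean_present + t_crit * se_present
  )

toy_comparison
```

Counting only non-missing values for `n_present` keeps the standard error consistent with the mean; counting every row with `n()` would include missing attendance and make the interval narrower than it should be.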
```{r Autism team cohort comparison} # filter attendance data to just get kids with ASD as a sen specific need comp_autism_tm <- attend |> filter(year >= 2021) |> filter(primary_specific_need == "Autism") # filter involvements to just get 0 to 5 team # in this case (Autism team) we'll just take any match on stud ID as involvement by the team, rather than looking for dates inv_autism_tm <- involvements |> filter(cohort_name == "Autism team", #age_on_inv_start >= 6 ) |> select(stud_id, open_date, close_date) |> rename(inv_open_date = open_date, inv_close_date = close_date) |> mutate(inv_flag = 1) |> distinct() # join involvements to attendance data comp_autism_tm <- comp_autism_tm |> left_join( inv_autism_tm, join_by(stud_id == stud_id, ht_start_date >= inv_open_date, # commenting this line means we just take any match as team involvement ht_start_date <= inv_close_date ) ) |> mutate(inv_flag = case_when( is.na(inv_flag) ~ 0, TRUE ~ inv_flag ) ) |> distinct() |> group_by(inv_flag, ncy) |> summarise (mean.percent_present = mean(percent_present, na.rm = TRUE), sd.percent_present = sd(percent_present, na.rm = TRUE), n.percent_present = n() ) |> mutate(se.percent_present = sd.percent_present / sqrt(n.percent_present), lower.ci.percent_present = mean.percent_present - qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present, upper.ci.percent_present = mean.percent_present + qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present) |> filter(ncy > 0) # plot ggplot(comp_autism_tm |> filter(ncy >= 0), aes(x = ncy, colour = factor(inv_flag), y = mean.percent_present)) + geom_point()+ geom_line()+ #geom_col(position = position_dodge(0.9)) +#, fill = "#0072B2")+ geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2)+ labs(title = "Attendance by curriculum year without and with Autism team involvement", subtitle = "Average percentage of sessions attended +-95 CI; SEN specific need of Autism spectrum disorder 2021 to 2023", x 
= "new curriculum year")+
  barplottheme_minimal +
  theme(axis.title.x = eb,
        #axis.text.y = eb,
        plot.title = element_markdown(size = 12),
        legend.position = "none"
        ) +
  scale_x_continuous(breaks = seq(0,11)) +
  scale_y_continuous(labels = scales::percent) +
  MetBrewer::scale_fill_met_d("Egypt")
```

But before we move on, it's worth considering the age on involvement start date. The Autism team generally become involved around age 5, but involvement can start at any age, and the team can become involved before a formal diagnosis is in place:

```{r}
#| label: plot age on first involvement date - autism team

autism_team_inv_stud_id <- involvements |>
  filter(cohort_name == "Autism team") |>
  select(stud_id) |>
  unique()

autism_team_age <- involvements |>
  filter(cohort_name == "Autism team") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% autism_team_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id"
            ) |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(autism_team_age, aes(x = age, y = child_count)) +
  geom_col(fill = "steel blue") +
  theme(axis.text.y = eb,
        axis.title.y = eb,
        axis.line.y = eb,
        axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Autism team",
       x = "age on involvement open date") +
  xlim(limits = c(0,20))
```

If we categorise children according to their age on first involvement date, there are some interesting differences in patterns of attendance. Here we've created three groups: those aged 5 and under when the team first become involved, those aged 6 to 10, and those aged 11+.
2016 seems to have been a peak year for all age groups (which may be a data quality issue), but 2016 aside, the volumes of all three of these groups have increased recently:

```{r}
#| label: plot autism team volumes by year and age on involvement start
#| warning: false

autism_team_age <- involvements |>
  filter(cohort_name == "Autism team") |>
  select(stud_id, cohort_name, open_date) |>
  mutate(year_open = year(open_date)) |>
  left_join(attend |>
              filter(stud_id %in% autism_team_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id"
            ) |>
  mutate(age = as.numeric( (open_date - dob)/365.25)) |>
  rename(year = year_open) |>
  mutate(year_grp = case_when(year %in% c(2013,2014,2015) ~ "2013 - 2015",
                              year %in% c(2016,2017,2018) ~ "2016 - 2018",
                              year %in% c(2019,2020,2021) ~ "2019 - 2021",
                              year %in% c(2022,2023,2024, 2025) ~ "2022 - 2025"),
         age_grp = case_when(age <= 5 ~ "under 5",
                             age <= 11 ~ "5 to 10",
                             TRUE ~ "11+")
         ) |>
  mutate(age_grp = factor(age_grp, levels = c("under 5","5 to 10","11+")))|>
  group_by(year, age_grp) |>
  tally() |>
  filter(year != 2025) |>
  ungroup() |>
  mutate(label = if_else(year == max(year), age_grp, NA_character_))

autism_team_age |>
  ggplot(aes(y = n, x = year,
             #fill = age_grp,
             colour = age_grp, group = age_grp))+
  geom_point() +
  geom_line() +
  geom_label_repel(aes(x = 2025, label = label), direction = "y", min.segment.length = Inf, size = 3.2) +
  scale_x_continuous(limits = c(2012,2027)) +
  labs(title = "Autism team volumes by year and age on involvement start",
       subtitle = "count of involvements starting per year",
       x = "year of involvement start") +
  theme(axis.title.y = eb,
        legend.position = "none",
        axis.line = eb,
        axis.ticks = eb)
```

Plotting attendance by national curriculum year for these three groups shows that children who have involvement with the team earlier in life have better attendance throughout their school career than those whose team involvement starts later.
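The three age bands described above can be made explicit with base R's `cut()`. This is a minimal sketch on made-up ages: the helper `age_band()` and the vector `toy_ages` are hypothetical, and the boundaries follow the wording above (5 and under, 6 to 10, 11+) applied to age in completed years.

```r
# Hypothetical helper: band age in completed years on first involvement
age_band <- function(age_years) {
  cut(
    floor(age_years),
    breaks = c(-Inf, 5, 10, Inf),
    labels = c("5 and under", "6 to 10", "11+"),
    right = TRUE  # intervals are (-Inf, 5], (5, 10], (10, Inf)
  )
}

# Made-up ages in years, chosen to sit on or near the band edges
toy_ages <- c(3.2, 5.9, 6.0, 10.7, 11.0, 14.5)
toy_bands <- age_band(toy_ages)

table(toy_bands)
```

Flooring first means a child aged 10 years 8 months falls in "6 to 10". The `case_when()` thresholds used in the chunks that follow (`<= 5`, `<= 11`) are applied to unfloored age in one chunk and floored age in another, so their band edges may differ slightly from this sketch.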
This may result from the actions & support of the team, or perhaps the impact of having a diagnosis at different ages, or it may result from effects upstream of these differences in diagnosis age, none of which we can unpick from the data, but the effect is there: ```{r Autism team cohort comparison lifetime diagnosis} # get autism team involvements and group by age bracket of first involvement date aut_tm_inv <- involvements |> filter(cohort_name == "Autism team") |> group_by(stud_id) |> summarise(inv_age = min(floor(age_on_inv_start))) |> ungroup() |> mutate(inv_age = case_when(inv_age <= 5 ~ "under 5", inv_age <= 11 ~ "5 to 10", TRUE ~ "11+")) |> select(stud_id, inv_age) |> mutate(inv_age = factor(inv_age, levels = c("under 5","5 to 10","11+"))) # attendance data of those in the basic cohort but with no team involvement (comparator) aut_tm_attend_by_age_on_start <- inner_join(attend, aut_tm_inv, by = "stud_id") |> filter(ncy > 0, ncy < 12) |> group_by(ncy, inv_age) |> presence_mean_calc() |> ungroup() |> mutate(label = if_else(ncy == max(ncy), inv_age, NA_character_)) # plot ggplot(aut_tm_attend_by_age_on_start, aes(x = ncy, colour = inv_age, y = mean.percent_present)) + geom_point() + geom_line(linetype = "dotted") + geom_label_repel(aes(x = 12, label = label), direction = "y", min.segment.length = Inf, size = 3.2) + geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2)+ labs(title = "Attendance by curriculum year and age on first involvement start - Autism team", subtitle = "Average percentage of sessions attended +-95 CI", x = "new curriculum year")+ barplottheme_minimal + theme( plot.title = element_markdown(size = 12), legend.position = "none") + scale_x_continuous(breaks = seq(0,11), limits = c(0,13)) + scale_y_continuous(labels = scales::percent) + annotate(geom = "text", label = str_wrap("age on first involvement date",width = 15), x = 12, y = 0.88, size = 3, colour = "grey40") ``` ## Inclusion Advice ```{r} #| 
label: inclusion advice volumes #| fig-align: "center" #| fig-height: 3.5 involvements |> filter(cohort_name == "Inclusion Advice") |> group_by(exam_year_inv_start) |> summarise(child_count = n_distinct(stud_id)) |> ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) + geom_col(width = 0.8, fill = "steel blue") + geom_text(colour = "steel blue", vjust = -0.8, size = 3) + barplottheme_minimal + labs(title = "Inclusion Advice volumes", subtitle = "involvements starting by academic year; 2025 is part year", caption = "data from Capita One") + theme(axis.title.x = eb, axis.text.y = eb) + scale_x_continuous(breaks = seq(2010,2025)) + coord_cartesian(clip = "off") ``` ```{r plot Inclusion Advice attendance} inv_summary_long_start |> filter(cohort_name == "Inclusion Advice") |> ggplot(aes(x = term_index, y = mean.percent_present ) ) + geom_point(colour = "steel blue") + geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue")+ labs(title = "Attendance before and after intervention - Inclusion Advice", subtitle = "Average percentage of available sessions attended +- 95 CI", x = "half term (relative to involvement start date)")+ barplottheme_minimal + scale_y_continuous(labels = scales::percent) + scale_x_continuous(breaks = seq(-12,12)) + theme(axis.text.x = element_text(size = 8), legend.position = "none") ``` Inclusion advice involvements are associated with a reduction in exclusion rates, but exclusion rates persist for these children, remaining far above the average rate (shown here with a dotted line) ```{r} #| label: plot Inclusion Advice exclusions #| fig-height: 3.5 inv_summary_long_start |> filter(cohort_name == "Inclusion Advice") |> ggplot(aes(x = term_index, y = mean.percent_excluded) ) + geom_point(colour = "dark red") + geom_errorbar(aes(ymin = lower.ci.percent_excluded, ymax = upper.ci.percent_excluded), width = 0.2, colour = "dark red")+ labs(title = "Exclusion before and 
after intervention - Inclusion Advice",
       subtitle = "Average percentage of available sessions where the child was excluded +- 95 CI",
       x = "half term (relative to involvement start date)")+
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none") +
  geom_hline(yintercept = excl_2023_avg, linetype = "dashed", colour = "dark grey") +
  geom_text(aes(x = 4,
                label = paste("2023 avg",scales::percent(excl_2023_avg, accuracy = 0.1)),
                y = excl_2023_avg),
            size = 2.5, vjust = 1, colour = "dark grey")
```

Inclusion advice is associated with an increase in "no reason" absences, and in _absences due to exclusions_. This fits with what we see above - exclusion rates are higher year on year for children who receive inclusion advice.

```{r}
#| label: plot before & after by reason code inclusion advice
#| fig-height: 4

before_after_year_fun(cohort_name_selection = "Inclusion Advice") |>
  ggplot(aes(x = name, y = mean, fill = prior_post, label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci,ymin = lower_ci), width = 0.2, alpha = 0.5, position = position_dodge(width = 0.9), colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons before and after Inclusion Advice",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12),
        legend.position = "none",
        axis.title = eb,
        axis.text.y = eb) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 2))
```

## Elective Home Education (EHE)

:::{.callout-warning}
It is unclear from the data whether the attendance data following an EHE
involvement is attendance back at school following a short period of home education, or if it shows attendance *in* the home education setting. But do home educators submit attendance data? Because of this uncertainty, the analysis included here is limited, but from the available data it seems a period of EHE results in increased attendance at school. ::: ```{r} #| label: plot EHE average attendance before and after inv_summary_long_start |> filter(cohort_name == "EHE") |> ggplot(aes(x = term_index, y = mean.percent_present ) ) + geom_point(colour = "steel blue", size = 2) + geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue")+ labs(title = "Average attendance before and after Elective Home Education", subtitle = "Mean percentage of available sessions attended +- 95 CI", caption = "data from Capita One", x = "half term (relative to start date)")+ barplottheme_minimal + scale_y_continuous(labels = scales::percent) + scale_x_continuous(breaks = seq(-12,12)) + theme(axis.text.x = element_text(size = 8), legend.position = "none") ``` ## Early Years Inclusion The Early Years inclusion team work almost entirely with children younger than school age, so a 'before & after' approach will not work here. 
```{r}
#| label: EY inclusion volumes and age on start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

involvements |>
  filter(cohort_name == "EY Inclusion", exam_year_inv_start >= 2019, exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Early Years Inclusion - volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

ey_inclusion_inv_stud_id <- involvements |>
  filter(cohort_name == "EY Inclusion") |>
  select(stud_id) |>
  unique()

ey_inclusion_age <- involvements |>
  filter(cohort_name == "EY Inclusion") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% ey_inclusion_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id"
            ) |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(ey_inclusion_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1)+
  theme(axis.text.y = eb,
        axis.title.y = eb,
        axis.line.y = eb,
        axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Early Years Inclusion",
       x = "age on involvement open date") +
  # seq(0, 10) gives breaks 0..10; seq(0:10) would give 1..11
  scale_x_continuous(breaks = seq(0, 10), limits = c(0,10))
```

Since the team work with SEN children, it makes sense to compare the attendance levels of children who require SEN support or an EHCP, with and without EY Inclusion team involvement.
In both cases, (after year 0 at least) we see better attendance in all years when the team are involved. (note that the 'with the team' line here cuts off in year ) ```{r} #| label: EY inclusion cohort comparison - sen support attendance # filter attendance data to just get SEN support kids Y5 and below comp_EY <- attend |> filter(year >= 2021) |> filter(sen_level == "SEN Support") inv_EY <- involvements |> filter(cohort_name == "EY Inclusion") |> select(stud_id) |> mutate(inv_flag = 1) |> distinct() # join involvements to attendance data comp_EY <- comp_EY |> left_join(inv_EY, by = "stud_id") |> mutate(inv_flag = case_when( is.na(inv_flag) ~ 0,TRUE ~ inv_flag)) |> distinct() comp_EY_summary <- comp_EY |> group_by(inv_flag, ncy) |> summarise (mean.percent_present = mean(percent_present, na.rm = TRUE), sd.percent_present = sd(percent_present, na.rm = TRUE), n.percent_present = n() ) |> mutate(se.percent_present = sd.percent_present / sqrt(n.percent_present), lower.ci.percent_present = mean.percent_present - qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present, upper.ci.percent_present = mean.percent_present + qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present) # plot ggplot(comp_EY_summary |> filter(ncy >= 0), aes(x = ncy, y = mean.percent_present, colour = factor(inv_flag))) + geom_point() + geom_line(linetype = "dotted") + geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2)+ geom_text(aes(label = scales::percent(round(mean.percent_present,3), accuracy = 0.1)), vjust = 3, colour = "white", size = 3, position = position_dodge(0.9)) + labs(title = "Attendance by curriculum year - children requiring SEN Support without and with Early Years Inclusion involvement", subtitle = "percentage of available sessions attended +- 95 CI", x = "new curriculum year")+ barplottheme_minimal + theme(axis.title.x = eb, plot.title = element_markdown(size = 10), legend.position = "none" ) + scale_x_continuous(breaks = 
seq(0,11)) + scale_y_continuous(labels = scales::percent) + MetBrewer::scale_fill_met_d("Egypt") ``` ```{r} #| label: EY inclusion cohort comparison - ehcp support attendance # filter attendance data to just get ehcp kids Y5 and below comp_EY <- attend |> filter(year >= 2021) |> filter(sen_level == "EHCP") inv_EY <- involvements |> filter(cohort_name == "EY Inclusion") |> select(stud_id) |> mutate(inv_flag = 1) |> distinct() # join involvements to attendance data comp_EY <- comp_EY |> left_join(inv_EY, by = "stud_id") |> mutate(inv_flag = case_when( is.na(inv_flag) ~ 0,TRUE ~ inv_flag)) |> distinct() comp_EY_summary <- comp_EY |> group_by(inv_flag, ncy) |> summarise (mean.percent_present = mean(percent_present, na.rm = TRUE), sd.percent_present = sd(percent_present, na.rm = TRUE), n.percent_present = n() ) |> mutate(se.percent_present = sd.percent_present / sqrt(n.percent_present), lower.ci.percent_present = mean.percent_present - qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present, upper.ci.percent_present = mean.percent_present + qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present) # plot ggplot(comp_EY_summary |> filter(ncy >= 0), aes(x = ncy, y = mean.percent_present, colour = factor(inv_flag))) + geom_point() + geom_line(linetype = "dotted") + geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2)+ geom_text(aes(label = scales::percent(round(mean.percent_present,3), accuracy = 0.1)), vjust = 3, colour = "white", size = 3, position = position_dodge(0.9)) + labs(title = "Attendance by curriculum year, children with an EHC Plan without and with Early Years Inclusion team involvement", subtitle = "percentage of available sessions attended +- 95 CI", x = "new curriculum year")+ barplottheme_minimal + theme(#axis.title.x = eb, #axis.text.y = eb, plot.title = element_markdown(size = 10), legend.position = "none" ) + scale_x_continuous(breaks = seq(0,11)) + scale_y_continuous(labels = scales::percent) + 
MetBrewer::scale_fill_met_d("Egypt") ``` ## Exclusion Concern This involvement identifies children with high levels of temporary and permanent exclusions. This is growing over time, presumably as exclusion rates are also growing. ```{r} #| label: Exclusion concern volumes #| out-width: 75% #| fig-height: 3 #| fig-align: "center" involvements |> filter(cohort_name == "Exclusion Concern", exam_year_inv_start >= 2019, exam_year_inv_start <= 2024) |> group_by(exam_year_inv_start) |> summarise(child_count = n_distinct(stud_id)) |> ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) + geom_col(width = 0.8, fill = "steel blue") + geom_text(colour = "steel blue", vjust = -0.8, size = 3) + barplottheme_minimal + labs(title = "Exclusion Concern - volumes", subtitle = "involvements starting by accademic year; 2025 is part year only", caption = "data from Capita One") + theme(axis.title.x = eb, axis.text.y = eb) + scale_x_continuous(breaks = seq(2010,2025)) + coord_cartesian(clip = "off") ``` The focus here is exclusions, and looking at the before & after plot for exclusions, we see extremely high exclusion rates prior to involvement, and a significant reduction afterward, but rates continue to be well above average: ```{r} #| label: plot Exclusione Concern exclusions #| fig-height: 3.5 inv_summary_long_start |> filter(cohort_name == "Exclusion Concern") |> ggplot(aes(x = term_index, y = mean.percent_excluded) ) + geom_point(colour = "dark red") + geom_errorbar(aes(ymin = lower.ci.percent_excluded, ymax = upper.ci.percent_excluded), width = 0.2, colour = "dark red")+ labs(title = "Exclusion before and after intervention - Exclusion Concern", subtitle = "Average percentage of available sessions where the child was excluded +- 95 CI", x = "half term (relative to involvement start date)")+ barplottheme_minimal + scale_y_continuous(labels = scales::percent) + scale_x_continuous(breaks = seq(-12,12)) + theme(axis.text.x = element_text(size = 8), 
        legend.position = "none") +
  geom_hline(yintercept = excl_2023_avg,
             linetype = "dashed",
             colour = "dark grey") +
  geom_text(aes(x = 4,
                label = paste("2023 avg",scales::percent(excl_2023_avg, accuracy = 0.1)),
                y = excl_2023_avg),
            size = 2.5,
            vjust = 1,
            colour = "dark grey")
```

Exclusion concern involvements are often correlated with other involvement types - inclusion panel, think for the future and reduced timetables.

## PA Cohort Tracking

This involvement simply means the child is persistently absent, and is being tracked as such. (See also SA cohort tracking, below.)

```{r}
#| label: PA Cohort Tracking volumes
#| fig-align: "center"
#| fig-height: 3.5

involvements |>
  filter(cohort_name == "PA Cohort Tracking") |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start,
             y = child_count,
             label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "PA Cohort Tracking volumes",
       subtitle = "involvements starting by academic year; 2025 is part year",
       caption = "data from Capita One") +
  theme(axis.title.x = eb,
        axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")
```

The PA cohort tracking involvement shows no particular improvement in attendance levels:

```{r}
#| label: plot PA cohort tracking before after attendance
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "PA Cohort Tracking") |>
  ggplot(aes(x = term_index,
             y = mean.percent_present )
  ) +
  geom_point(colour = "steel blue") +
  geom_errorbar(aes(ymin = lower.ci.percent_present,
                    ymax = upper.ci.percent_present),
                width = 0.2,
                colour = "steel blue")+
  labs(title = "Attendance before and after intervention - PA Cohort Tracking",
       subtitle = "Average percentage of available sessions attended +- 95 CI",
       x = "half term (relative to involvement start date)")+
  barplottheme_minimal +
  scale_y_continuous(labels
                     = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8),
        legend.position = "none")
```

## Consultation

N.B. At the time of writing this involvement is not understood - we had feedback that it was not in use in late 2023, but there are volumes of activity in 2024 and 2025.

```{r}
#| label: consultation volumes
#| fig-align: "center"
#| fig-height: 3.5

involvements |>
  filter(cohort_name == "Consultation") |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start,
             y = child_count,
             label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Consultation involvement volumes",
       subtitle = "involvements starting by academic year; 2025 is part year",
       caption = "data from Capita One") +
  theme(axis.title.x = eb,
        axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")
```

The "Consultation" involvement before & after plot shows _very_ low attendance prior to involvement, with the trend continuing afterwards. However, activity prior to 2024 is very limited, so there simply isn't enough elapsed time to give reliable post-involvement data here.

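
One way to judge whether there is enough elapsed time is to count how many pupils contribute to each half term before reading anything into the averages. The sketch below uses invented toy data rather than the report's tables; the `min_pupils` threshold and the `coverage` object are hypothetical names introduced only for illustration.

```r
library(dplyr)

# Toy stand-in for pupil-level attendance by half term relative to
# involvement start. Column names mirror the report; values are invented.
set.seed(1)
toy <- data.frame(
  stud_id = rep(1:6, each = 4),
  term_index = rep(c(-2, -1, 1, 2), times = 6),
  percent_present = runif(24, min = 0.3, max = 0.9)
) |>
  # mimic recent involvements: most pupils have no data yet for term_index 2
  filter(!(term_index == 2 & stud_id > 2))

# hypothetical threshold, not a value used in the report
min_pupils <- 5

# count the distinct pupils behind each plotted point
coverage <- toy |>
  group_by(term_index) |>
  summarise(n_pupils = n_distinct(stud_id), .groups = "drop") |>
  mutate(enough_data = n_pupils >= min_pupils)

coverage
```

Half terms flagged with `enough_data == FALSE` could then be greyed out or dropped from the before & after plot, if that suits the analysis.
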
```{r}
#| label: plot Consultation before after attendance
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "Consultation") |>
  ggplot(aes(x = term_index,
             y = mean.percent_present )
  ) +
  geom_point(colour = "steel blue") +
  geom_errorbar(aes(ymin = lower.ci.percent_present,
                    ymax = upper.ci.percent_present),
                width = 0.2,
                colour = "steel blue")+
  labs(title = "Attendance before and after intervention - Consultation",
       subtitle = "Average percentage of available sessions attended +- 95 CI",
       x = "half term (relative to involvement start date)")+
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent, limits = c(0,1)) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8),
        legend.position = "none")
```

## Progressions Team

The progressions team are involved in matching children to alternative provision.

```{r}
#| label: Progressions team volumes
#| fig-align: "center"
#| fig-height: 3.5

involvements |>
  filter(cohort_name == "Progressions Team") |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start,
             y = child_count,
             label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Progressions Team volumes",
       subtitle = "involvements starting by academic year; 2025 is part year",
       caption = "data from Capita One") +
  theme(axis.title.x = eb,
        axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")
```

```{r}
# pupils with a Progressions Team involvement
prog_tm_inv_stud_id <- involvements |>
  filter(cohort_name == "Progressions Team") |>
  select(stud_id) |>
  unique()

# age in whole years on the involvement open date
prog_tm_age <- involvements |>
  filter(cohort_name == "Progressions Team") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% prog_tm_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id"
  ) |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(prog_tm_age,
       aes(x = age,
           y = child_count,
           label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1)+
  theme(axis.text.y = eb,
        axis.title.y = eb,
        axis.line.y = eb,
        axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Progressions Team",
       x = "age on involvement open date") +
  scale_x_continuous(breaks = seq(1:18), limits = c(3,16)) +
  coord_cartesian(clip = "off")
```

```{r}
#| label: table of progressions team count by sen level
#| fig-align: center

involvements |>
  filter(
    ##exam_year_inv_start == 2024,
    cohort_name == "Progressions Team") |>
  group_by(
    #sen_need_category
    sen_level
  ) |>
  summarise(child_count = n_distinct(stud_id)) |>
  arrange(desc(child_count)) |>
  gt(rowname_col = "sen_level") |>
  tab_header(
    title = "Progressions Team",
    subtitle = "counts of children with involvements in 2024, by SEN level"
  ) |>
  tab_options(
    table.align = "left",
    heading.title.font.size = 12,
    heading.subtitle.font.size= 10,
    heading.align = "left",
    column_labels.font.size = 14,
    stub.font.size = 12,
    column_labels.hidden = TRUE
  ) |>
  cols_align("left",'sen_level')
```

There is no improvement to attendance levels associated with progressions team involvement:

```{r}
#| label: plot progressions team before after attendance
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "Progressions Team") |>
  ggplot(aes(x = term_index,
             y = mean.percent_present )
  ) +
  geom_point(colour = "steel blue") +
  geom_errorbar(aes(ymin = lower.ci.percent_present,
                    ymax = upper.ci.percent_present),
                width = 0.2,
                colour = "steel blue")+
  labs(title = "Attendance before and after intervention - Progressions Team",
       subtitle = "Average percentage of available sessions attended +- 95 CI",
       x = "half term (relative to involvement start date)")+
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x =
          element_text(size = 8),
        legend.position = "none")
```

The progressions team involvement is associated with an increase in 'no reason' absences and absences due to exclusion:

```{r}
#| label: plot before & after by reason code progressions team
#| fig-height: 4

before_after_year_fun(cohort_name_selection = "Progressions Team") |>
  ggplot(aes(x = name,
             y = mean,
             fill = prior_post,
             label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci,ymin = lower_ci),
                width = 0.2,
                alpha = 0.5,
                position = position_dodge(width = 0.9),
                colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9),
            vjust = -0.2,
            size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons before and after Progressions Team",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12),
        legend.position = "none",
        axis.title = eb,
        axis.text.y = eb) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 2))
```

## Penalty notice warning letter

```{r}
#| label: penatly notice warning letter volumes
#| fig-align: "center"
#| fig-height: 3

involvements |>
  filter(cohort_name == "penalty notice warning letter") |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start,
             y = child_count,
             label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Penalty notice warning letter volumes",
       subtitle = "involvements starting by academic year",
       caption = "data from Capita One") +
  theme(axis.title.x = eb,
        axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2024)) +
  coord_cartesian(clip = "off")
```

Penalty notice warning letters don't appear to have any
effect on overall attendance (this is in contrast to the Section 437 notices covered later on):

```{r}
#| label: plot penalty notice warning letter average attendance before and after
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "penalty notice warning letter") |>
  ggplot(aes(x = term_index,
             y = mean.percent_present )
  ) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present,
                    ymax = upper.ci.percent_present),
                width = 0.2,
                colour = "steel blue")+
  labs(title = "Average attendance before and after penalty notice warning letter",
       subtitle = "Mean percentage of available sessions attended +- 95 CI",
       caption = "data from Capita One",
       x = "half term (relative to start date)")+
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8),
        legend.position = "none")
```

Penalty notice warning letters are associated with year on year increases in most coded absence reasons, particularly 'no reason' absences:

```{r}
#| label: plot before & after by reason code penalty notice warning letter
#| fig-height: 4

before_after_year_fun(cohort_name_selection = "penalty notice warning letter") |>
  ggplot(aes(x = name,
             y = mean,
             fill = prior_post,
             label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci,ymin = lower_ci),
                width = 0.2,
                alpha = 0.5,
                position = position_dodge(width = 0.9),
                colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9),
            vjust = -0.2,
            size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons before and after penalty notice warning letter",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12),
        legend.position = "none",
        axis.title = eb,
        axis.text.y = eb) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 2))
```

## SA Cohort Tracking

This involvement simply means the child is severely absent, and is being tracked as such. (See also PA cohort tracking, above.)

```{r}
#| label: SA Cohort Tracking volumes
#| fig-align: "center"
#| fig-height: 3.5

involvements |>
  filter(cohort_name == "SA Cohort Tracking") |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start,
             y = child_count,
             label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Severe Absence Cohort Tracking volumes",
       subtitle = "involvements starting by academic year; 2025 is part year",
       caption = "data from Capita One") +
  theme(axis.title.x = eb,
        axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")
```

Children with the SA cohort tracking involvement show a general improvement in attendance levels:

```{r}
#| label: plot SA cohort tracking before after attendance
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "SA Cohort Tracking") |>
  ggplot(aes(x = term_index,
             y = mean.percent_present )
  ) +
  geom_point(colour = "steel blue") +
  geom_errorbar(aes(ymin = lower.ci.percent_present,
                    ymax = upper.ci.percent_present),
                width = 0.2,
                colour = "steel blue")+
  labs(title = "Attendance before and after intervention - SA Cohort Tracking",
       subtitle = "Average percentage of available sessions attended +- 95 CI",
       x = "half term (relative to involvement start date)")+
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8),
        legend.position = "none")
```

## Parenting

The parenting data comes via the Early Help data model, and is mapped onto attendance data like the other involvement
types. These are children whose parents are enrolled on parenting programmes.

:::{.callout-warning}
N.B. the volumes here drop off in 2022, and work is needed to check we're pulling in all available data.
:::

```{r}
#| label: parenting volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "Parenting",
         exam_year_inv_start >= 2019,
         exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start,
             y = child_count,
             label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Parenting - volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb,
        axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

# pupils with a Parenting involvement
parenting_inclusion_inv_stud_id <- involvements |>
  filter(cohort_name == "Parenting") |>
  select(stud_id) |>
  unique()

# age in whole years on the involvement open date
parenting_inclusion_age <- involvements |>
  filter(cohort_name == "Parenting") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% parenting_inclusion_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id"
  ) |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(parenting_inclusion_age,
       aes(x = age,
           y = child_count,
           label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1)+
  theme(axis.text.y = eb,
        axis.title.y = eb,
        axis.line.y = eb,
        axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Parenting",
       x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:12), limits = c(0,11))
```

Parenting programmes don't appear to have any effect on overall
attendance, indeed average levels continue to decline after the involvement start:

```{r}
#| label: plot parenting average attendance before and after
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "Parenting") |>
  ggplot(aes(x = term_index,
             y = mean.percent_present )
  ) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present,
                    ymax = upper.ci.percent_present),
                width = 0.2,
                colour = "steel blue")+
  labs(title = "Average attendance before and after Parenting",
       subtitle = "Mean percentage of available sessions attended +- 95 CI",
       caption = "data from Capita One",
       x = "half term (relative to start date)")+
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8),
        legend.position = "none")
```

Looking at the coded reasons, there are no shifts here that are not comparable with those seen in the random samples - parenting programmes seem to have no significant effects:

```{r}
#| label: plot before & after by reason code Parenting
#| fig-height: 4

before_after_year_fun(cohort_name_selection = "Parenting") |>
  ggplot(aes(x = name,
             y = mean,
             fill = prior_post,
             label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci,ymin = lower_ci),
                width = 0.2,
                alpha = 0.5,
                position = position_dodge(width = 0.9),
                colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9),
            vjust = -0.2,
            size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons before and after Parenting",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12),
        legend.position = "none",
        axis.title = eb,
        axis.text.y = eb) +
  scale_x_discrete(labels = function(x)
    str_wrap(x, width = 2))
```

## Other

N.B. At the time of writing, we don't know what this involvement actually covers. There is an involvement cohort name of simply "other", with over 150 involvements starting in 2024:

```{r}
#| label: other volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "Other",
         exam_year_inv_start >= 2019,
         exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start,
             y = child_count,
             label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Other - volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb,
        axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

# pupils with an "Other" involvement
other_inclusion_inv_stud_id <- involvements |>
  filter(cohort_name == "Other") |>
  select(stud_id) |>
  unique()

# age in whole years on the involvement open date
other_inclusion_age <- involvements |>
  filter(cohort_name == "Other") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% other_inclusion_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id"
  ) |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(other_inclusion_age,
       aes(x = age,
           y = child_count,
           label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1)+
  theme(axis.text.y = eb,
        axis.title.y = eb,
        axis.line.y = eb,
        axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Other",
       x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:12), limits = c(0,11))
```

"Other" type involvements see a substantial change in average attendance levels:

```{r}
#| label: plot other average attendance before and after
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "Other") |>
  ggplot(aes(x = term_index,
             y = mean.percent_present )
  ) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present,
                    ymax = upper.ci.percent_present),
                width = 0.2,
                colour = "steel blue")+
  labs(title = "Average attendance before and after Other type involvements",
       subtitle = "Mean percentage of available sessions attended +- 95 CI",
       caption = "data from Capita One",
       x = "half term (relative to start date)")+
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8),
        legend.position = "none")
```

Looking at the coded reasons, there are no significant year on year shifts associated with the 'Other' type involvements.

```{r}
#| label: plot before & after by reason code Other
#| fig-height: 4

before_after_year_fun(cohort_name_selection = "Other") |>
  ggplot(aes(x = name,
             y = mean,
             fill = prior_post,
             label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci,ymin = lower_ci),
                width = 0.2,
                alpha = 0.5,
                position = position_dodge(width = 0.9),
                colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9),
            vjust = -0.2,
            size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons before and after Other",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12),
        legend.position = "none",
        axis.title = eb,
        axis.text.y = eb) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 2))
```

## Hearing Impairment Team

```{r}
#| label: hearing impairment team volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "HI team",
         exam_year_inv_start >= 2019,
         exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start,
             y = child_count,
             label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Hearing Impairment Team - volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb,
        axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

# pupils with an HI team involvement
hi_tm_inclusion_inv_stud_id <- involvements |>
  filter(cohort_name == "HI team") |>
  select(stud_id) |>
  unique()

# age in whole years on the involvement open date
hi_tm_inclusion_age <- involvements |>
  filter(cohort_name == "HI team") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% hi_tm_inclusion_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id"
  ) |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(hi_tm_inclusion_age,
       aes(x = age,
           y = child_count,
           label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1)+
  theme(axis.text.y = eb,
        axis.title.y = eb,
        axis.line.y = eb,
        axis.ticks.y = eb) +
  labs(title = "Age on involvement start - HI team",
       x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

Involvement with the Hearing Impairment team has no apparent effect on attendance - which for this cohort is healthy throughout.

```{r}
#| label: plot HI team average attendance before and after
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "HI team") |>
  ggplot(aes(x = term_index,
             y = mean.percent_present )
  ) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present,
                    ymax = upper.ci.percent_present),
                width = 0.2,
                colour = "steel blue")+
  labs(title = "Average attendance before and after Hearing Impairment team",
       subtitle = "Mean percentage of available sessions attended +- 95 CI",
       caption = "data from Capita One",
       x = "half term (relative to start date)")+
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8),
        legend.position = "none")
```

The analysis of coded reasons shows no changes that are not comparable with the background changes seen in the random samples, and so is not included here.

```{r}
#| label: plot before & after by reason code HI team
#| fig-height: 4
#| eval: false

before_after_year_fun(cohort_name_selection = "HI team") |>
  ggplot(aes(x = name,
             y = mean,
             fill = prior_post,
             label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci,ymin = lower_ci),
                width = 0.2,
                alpha = 0.5,
                position = position_dodge(width = 0.9),
                colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9),
            vjust = -0.2,
            size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons before and after Hearing Impairment team",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12),
        legend.position = "none",
        axis.title = eb,
        axis.text.y = eb) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 2))
```

Taking children with a primary specific need
of Hearing Impairment, we can do a 'with & without team involvement' analysis. This shows consistently lower attendance through all school years for children with the team involvement. Presumably this reflects differences in severity of need within the cohort of hearing impaired children.

```{r}
#| label: HI team cohort comparison - attendance

# filter attendance data from 2021 onwards to pupils whose primary specific need is hearing impairment
comp_HI <- attend |>
  filter(year >= 2021) |>
  filter(primary_specific_need ==
           #"SEN Support"
           #,
           "Hearing Impairment"
  )

inv_HI <- involvements |>
  filter(cohort_name == "HI team") |>
  select(stud_id) |>
  mutate(inv_flag = 1) |>
  distinct()

# join involvements to attendance data
comp_HI <- comp_HI |>
  left_join(inv_HI, by = "stud_id") |>
  mutate(inv_flag = case_when(
    is.na(inv_flag) ~ 0,
    TRUE ~ inv_flag)) |>
  distinct()

# mean attendance with a t-based 95% confidence interval
comp_HI_summary <- comp_HI |>
  group_by(inv_flag, ncy) |>
  summarise(mean.percent_present = mean(percent_present, na.rm = TRUE),
            sd.percent_present = sd(percent_present, na.rm = TRUE),
            n.percent_present = n()
  ) |>
  mutate(se.percent_present = sd.percent_present / sqrt(n.percent_present),
         lower.ci.percent_present = mean.percent_present - qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present,
         upper.ci.percent_present = mean.percent_present + qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present)

# plot
ggplot(comp_HI_summary |> filter(ncy > 0),
       aes(x = ncy,
           y = mean.percent_present,
           colour = factor(inv_flag))) +
  geom_point() +
  geom_line(linetype = "dotted") +
  geom_errorbar(aes(ymin = lower.ci.percent_present,
                    ymax = upper.ci.percent_present),
                width = 0.2)+
  geom_text(aes(label = scales::percent(round(mean.percent_present,3), accuracy = 0.1)),
            vjust = 3,
            colour = "white",
            size = 3,
            position = position_dodge(0.9)) +
  labs(title = "Attendance by curriculum year without and with Hearing Impairment team involvement",
       subtitle = "Pupils with a primary specific need of hearing impairment; avg % of available sessions attended +- 95 CI",
       x = "new curriculum year")+
  barplottheme_minimal +
  theme(axis.title.x = eb,
        plot.title = element_markdown(size = 12),
        legend.position = "none"
  ) +
  scale_x_continuous(breaks = seq(0,11)) +
  scale_y_continuous(labels = scales::percent) +
  MetBrewer::scale_fill_met_d("Egypt")
```

## Inclusion & Attendance Y4 / Inclusion & Attendance Y9

### I&A Y4

The Inclusion & Attendance Y4 team get involved from Y4 onwards in order to assist with the transition to secondary school in Y6-7, which is generally associated with a significant drop in attendance. Activity is fairly low in recent years, with only about 40 involvements starting each year.

```{r}
#| label: I A y4 volumes & age on first start
#| fig-align: "center"
#| layout-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "I&A - Y4",
         exam_year_inv_start >= 2019,
         exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start,
             y = child_count,
             label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "I&A - Y4 volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb,
        axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

# pupils with an I&A Y4 involvement
ia_y4_inclusion_inv_stud_id <- involvements |>
  filter(cohort_name == "I&A - Y4") |>
  select(stud_id) |>
  unique()

# age in whole years on the involvement open date
ia_y4_inclusion_age <- involvements |>
  filter(cohort_name == "I&A - Y4") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% ia_y4_inclusion_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id"
  ) |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(ia_y4_inclusion_age,
       aes(x = age,
           y = child_count,
           label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1)+
  theme(axis.text.y = eb,
        axis.title.y = eb,
        axis.line.y = eb,
        axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Inclusion & Attendance Y4 team",
       x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

The team work more with children in more deprived areas, and those who require SEN support.

```{r}
#| label: table of I A Y4 team count by characteristics
#| fig-align: center
#| layout: [45, -10, 45]

involvements |>
  filter(
    ##exam_year_inv_start == 2024,
    cohort_name == "I&A - Y4",
    !is.na(sen_level)) |>
  group_by(
    #sen_need_category
    sen_level
  ) |>
  summarise(child_count = n_distinct(stud_id)) |>
  arrange(desc(child_count)) |>
  gt(rowname_col = "sen_level") |>
  tab_header(
    title = "Inclusion & Attendance Y4 team",
    subtitle = "counts of children with involvements in 2024, by SEN level"
  ) |>
  tab_options(
    table.align = "left",
    heading.title.font.size = 12,
    heading.subtitle.font.size= 10,
    heading.align = "left",
    column_labels.font.size = 14,
    stub.font.size = 12,
    column_labels.hidden = TRUE
  ) |>
  cols_align("left",'sen_level')

# the stub column here is the IMD quartile, matching the grouping variable
involvements |>
  filter(cohort_name == "I&A - Y4",
         !is.na(imd_quartile)) |>
  group_by(imd_quartile) |>
  summarise(child_count = n_distinct(stud_id)) |>
  gt(rowname_col = "imd_quartile") |>
  tab_header(
    title = "Inclusion & Attendance Y4 team - counts by IMD quartile",
    subtitle = "4 = most deprived wards; 1 = least deprived"
  ) |>
  tab_options(
    table.align = "left",
    heading.title.font.size = 12,
    heading.subtitle.font.size= 10,
    heading.align = "left",
    column_labels.font.size = 14,
    stub.font.size = 12,
    column_labels.hidden = TRUE
  ) |>
  cols_align("left",'imd_quartile')
```

The children the I&A Y4 team work with also tend to have lower attendance throughout primary school:

```{r}
#| label: plot primary attendance with and without I A Y4 team involvement
#| fig-height: 3.5

y4_inv_stud_id <- involvements |>
  filter(cohort_name == "I&A - Y4") |>
  select(stud_id) |>
  distinct() |>
  mutate(y4_team = TRUE)

attend_pri <- attend |>
  filter( ncy %in% c(1,2,3,4,5,6) & year >= 2021 )

y4_primary <- attend_pri |>
  left_join(y4_inv_stud_id, by = "stud_id") |>
  mutate(y4_team = if_else(is.na(y4_team),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","y4_team")) |>
  group_by(ncy,y4_team) |>
  summarise_avg() |>
  mutate(category = "overall")

ggplot(y4_primary,
       aes(x = ncy,
           y = mean.percent_present,
           colour = y4_team,
           group = y4_team)
) +
  geom_point() +
  geom_line(alpha = 0.7 ) +
  geom_errorbar(aes(ymin = lower.ci.percent_present,
                    ymax = upper.ci.percent_present),
                width = 0.2)+
  geom_vline(aes(xintercept = 6.5), linetype = "dotted", colour = "gray") +
  labs(title = "Attendance through primary - pupils without and with I&A Y4 involvement",
       subtitle = "Average percentage of available sessions attended since 2021; +- 95 CI",
       caption = "data from Capita One") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(labels = seq(1,6), breaks = seq(1,6)) +
  theme(axis.title.x = eb,
        legend.position = "none",
        strip.background = eb,
        strip.text = element_text(size = 8),
        plot.title = element_markdown(size = 12)) +
  MetBrewer::scale_fill_met_d("Egypt")
```

Since the team's aim is to assist with the transition to secondary school (Y6 to Y7), it makes sense to compare this transition point for children who have involvement with those who do not. We'll do so across some other factors, like SEN levels, deprivation and prior attendance levels.

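
As a rough illustration of what comparing the transition point means, the sketch below computes the Y6 to Y7 change in mean attendance separately for pupils with and without involvement. It runs on invented toy data, and the `transition` object and its `y6`, `y7` and `change` columns are hypothetical names, not part of the report's data model.

```r
library(dplyr)

# Toy pupil-level attendance: one row per pupil per curriculum year.
# All values are invented for illustration.
toy <- data.frame(
  stud_id = rep(1:4, each = 2),
  ncy = rep(c(6, 7), times = 4),
  y4_team = rep(c(TRUE, TRUE, FALSE, FALSE), each = 2),
  percent_present = c(0.90, 0.80, 0.88, 0.82, 0.96, 0.93, 0.94, 0.91)
)

transition <- toy |>
  # mean attendance per group and curriculum year
  group_by(y4_team, ncy) |>
  summarise(mean_present = mean(percent_present), .groups = "drop") |>
  # change across the primary to secondary transition, per group
  group_by(y4_team) |>
  summarise(
    y6 = mean_present[ncy == 6],
    y7 = mean_present[ncy == 7],
    change = y7 - y6,
    .groups = "drop"
  )

transition
```

A larger negative `change` for one group would indicate a steeper drop at transition, though a simple difference like this says nothing about causation, since the groups differ in prior attendance.
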
The charts below show this analysis from years 4 to 8 for all pupils, those with SEN support, EHCP, those in the most deprived wards of the city, and those who were severely absent in Y4 or in Y6: ```{r} #| label: I A Y4 effectiveness by characteristic group #| fig-height: 8 attend_y7 <- attend |> filter( ncy %in% c(4,5,6,7,8) & year >= 2021 ) y7_overall <- attend_y7 |> left_join(y4_inv_stud_id, by = "stud_id") |> mutate(y4_team = if_else(is.na(y4_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","y4_team")) |> group_by(ncy,y4_team) |> summarise_avg() |> mutate(category = "all pupils") y7_deprived <- attend_y7 |> left_join(y4_inv_stud_id, by = "stud_id") |> filter(ward_imd_score >= 40 ) |> mutate(y4_team = if_else(is.na(y4_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","y4_team")) |> group_by(ncy,y4_team) |> summarise_avg() |> mutate(category = "most deprived wards") y7_sen_support <- attend_y7 |> left_join(y4_inv_stud_id, by = "stud_id") |> filter(sen_level == "SEN Support" ) |> mutate(y4_team = if_else(is.na(y4_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","y4_team")) |> group_by(ncy,y4_team) |> summarise_avg() |> mutate(category = "SEN Support") y7_ehcp <- attend_y7 |> left_join(y4_inv_stud_id, by = "stud_id") |> filter(sen_level == "EHCP" ) |> mutate(y4_team = if_else(is.na(y4_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","y4_team")) |> group_by(ncy,y4_team) |> summarise_avg() |> mutate(category = "EHCP") sa_y4 <- attend |> filter(ncy == 4, severe_absence == 1) |> select(stud_id) |> distinct() y7_sa_y4 <- attend_y7 |> left_join(y4_inv_stud_id, by = "stud_id") |> filter(stud_id %in% sa_y4$stud_id) |> mutate(y4_team = if_else(is.na(y4_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","y4_team")) |> group_by(ncy,y4_team) |> summarise_avg() |> mutate(category = "severely absent y4") sa_y6 <- attend |> filter(ncy == 6, 
severe_absence == 1) |> select(stud_id) |> distinct() y7_sa_y6 <- attend_y7 |> left_join(y4_inv_stud_id, by = "stud_id") |> filter(stud_id %in% sa_y6$stud_id) |> mutate(y4_team = if_else(is.na(y4_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","y4_team")) |> group_by(ncy,y4_team) |> summarise_avg() |> mutate(category = "severely absent y6") rbind( y7_overall, y7_sen_support, y7_deprived, y7_ehcp, y7_sa_y4, y7_sa_y6 ) |> mutate(category = factor(category, levels = c("all pupils", "most deprived wards", "SEN Support", "EHCP", "severely absent y4", "severely absent y6") )) |> ggplot(aes(x = ncy, y = mean.percent_present, colour = y4_team, group = y4_team)) + geom_point() + geom_line(alpha = 0.7 ) + #linetype = "dotted") + #geom_text(aes(label = n.percent_present)) + geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2)+ geom_text(aes(label = scales::percent(round(mean.percent_present,3))), vjust = 2, colour = "white", size = 4, position = position_dodge(0.9)) + geom_vline(aes(xintercept = 6.5), linetype = "dotted", colour = "gray") + facet_wrap(vars(category)) + labs(title = "Attendance through years 4 to 8 - pupils without and with I&A Y4 involvement", subtitle = "Average percentage of available sessions attended since 2021; +- 95 CI", caption = "data from Capita One") + barplottheme_minimal + scale_y_continuous(labels = scales::percent) + scale_x_continuous(labels = seq(4,8), breaks = seq(4,8)) + theme(axis.title.x = eb, legend.position = "none", strip.background = eb, strip.text = element_text(size = 8), plot.title = element_markdown(size = 12)) + MetBrewer::scale_fill_met_d("Egypt") ``` The above plots show that attendance throughout years 4 to 7 is consistently lower for all children the team work with. 
Looking at the transition to secondary school, for all pupils on average, for children on SEN support and for children in the most deprived wards, the _drop_ from Y6 to Y7 is more severe among children the team work with. This presumably reflects the targeting of this service at children with the highest needs, according to factors not visible in the data. For children with an EHC plan, and children who were severely absent in Y4 or Y6, attendance actually _improves_ into Y7 when the team are involved.

The plots below show the net difference in overall % attendance between Y6 and Y7 for the same groups given above, again split by whether the I&A Y4 team were involved:

```{r}
#| label: I A Y4 y6 to y7 change by characteristic group
##| fig-height: 8

attend_6_7 <- attend |>
  filter(
    ncy %in% c(6,7) & year >= 2021
  )

a1 <- attend_6_7 |>
  filter(stud_id == 298660) |>
  select(stud_id, ncy)

a2 <- attend |>
  filter(stud_id == 292393) |>
  select(stud_id, ncy, year)

change_6_7_overall <- attend_6_7 |>
  left_join(y4_inv_stud_id, by = "stud_id") |>
  mutate(y4_team = if_else(is.na(y4_team),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","y4_team")) |>
  select(stud_id, y4_team, ncy, percent_present) |>
  pivot_wider(names_from = ncy, values_from = percent_present) |>
  mutate(percent_present = `7` - `6`) |>
  select(-`7`, -`6`) |>
  group_by(y4_team) |>
  summarise_avg() |>
  mutate(category = "all pupils")

change_6_7_deprived <- attend_6_7 |>
  filter(ward_imd_score >= 40 ) |>
  left_join(y4_inv_stud_id, by = "stud_id") |>
  mutate(y4_team = if_else(is.na(y4_team),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","y4_team")) |>
  select(stud_id, y4_team, ncy, percent_present) |>
  pivot_wider(names_from = ncy, values_from = percent_present) |>
  mutate(percent_present = `7` - `6`) |>
  select(-`7`, -`6`) |>
  group_by(y4_team) |>
  summarise_avg() |>
  mutate(category = "most deprived wards")

change_6_7_sen_support <- attend_6_7 |>
  filter(sen_level == "SEN Support" )
|> left_join(y4_inv_stud_id, by = "stud_id") |> mutate(y4_team = if_else(is.na(y4_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","y4_team")) |> select(stud_id, y4_team, ncy, percent_present) |> pivot_wider(names_from = ncy, values_from = percent_present) |> mutate(percent_present = `7` - `6`) |> select(-`7`, -`6`) |> group_by(y4_team) |> summarise_avg() |> mutate(category = "SEN Support") change_6_7_ehcp <- attend_6_7 |> filter(sen_level == "EHCP" ) |> left_join(y4_inv_stud_id, by = "stud_id") |> mutate(y4_team = if_else(is.na(y4_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","y4_team")) |> select(stud_id, y4_team, ncy, percent_present) |> pivot_wider(names_from = ncy, values_from = percent_present) |> mutate(percent_present = `7` - `6`) |> select(-`7`, -`6`) |> group_by(y4_team) |> summarise_avg() |> mutate(category = "EHCP") change_6_7_sa_y4 <- attend_6_7 |> filter(stud_id %in% sa_y4$stud_id) |> left_join(y4_inv_stud_id, by = "stud_id") |> mutate(y4_team = if_else(is.na(y4_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","y4_team")) |> select(stud_id, y4_team, ncy, percent_present) |> pivot_wider(names_from = ncy, values_from = percent_present) |> mutate(percent_present = `7` - `6`) |> select(-`7`, -`6`) |> group_by(y4_team) |> summarise_avg() |> mutate(category = "severely absent y4") change_6_7_sa_y6 <- attend_6_7 |> filter(stud_id %in% sa_y6$stud_id) |> left_join(y4_inv_stud_id, by = "stud_id") |> mutate(y4_team = if_else(is.na(y4_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","y4_team")) |> select(stud_id, y4_team, ncy, percent_present) |> pivot_wider(names_from = ncy, values_from = percent_present) |> mutate(percent_present = `7` - `6`) |> select(-`7`, -`6`) |> group_by(y4_team) |> summarise_avg() |> mutate(category = "severely absent y6") rbind( change_6_7_overall, change_6_7_deprived, change_6_7_sen_support, change_6_7_ehcp, 
  change_6_7_sa_y4,
  change_6_7_sa_y6
) |>
  mutate(category = factor(category, levels = c("all pupils",
                                                "most deprived wards",
                                                "SEN Support",
                                                "EHCP",
                                                "severely absent y4",
                                                "severely absent y6")
  )) |>
  ggplot(aes(x = category, y = mean.percent_present, fill = y4_team)) +
  geom_col(position = position_dodge(0.9)) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, position = position_dodge(0.9)) +
  geom_vline(aes(xintercept = 6.5), linetype = "dotted", colour = "gray") +
  labs(title = "Average Y6 to Y7 change - pupils without and with I&A Y4 involvement, for various groups",
       subtitle = "Avg % of available sessions attended +- 95 CI; 2021 onwards",
       caption = "data from Capita One") +
  barplottheme_minimal +
  theme(axis.title.x = eb,
        axis.text.y = eb,
        legend.position = "none",
        strip.background = eb,
        strip.text = element_text(size = 8),
        plot.title = element_markdown(size = 12)) +
  MetBrewer::scale_fill_met_d("Egypt")
```

We should also look at exclusion rates for the Y4 team:

```{r}
#| label: plot I A Y4 exclusions
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "I&A - Y4") |>
  ggplot(aes(x = term_index, y = mean.percent_excluded)) +
  geom_point(colour = "dark red") +
  geom_errorbar(aes(ymin = lower.ci.percent_excluded, ymax = upper.ci.percent_excluded), width = 0.2, colour = "dark red") +
  labs(title = "Exclusion before and after intervention - I&A - Y4",
       subtitle = "Average percentage of available sessions where the child was excluded +- 95 CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8),
        legend.position = "none") +
  geom_hline(yintercept = excl_2023_avg, linetype = "dashed", colour = "dark grey") +
  # accuracy is numeric, so 0.1 rather than the integer-suffixed 0.1L
  geom_text(aes(x = 4, label = paste("2023 avg",scales::percent(excl_2023_avg, accuracy = 0.1)), y = excl_2023_avg),
            size = 2.5, vjust = 1, colour = "dark grey")
```

### I&A - Y9

```{r}
#| label: I A y9 volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "I&A - Y9",
         exam_year_inv_start >= 2019,
         exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "I&A - Y9 volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

ia_y9_inclusion_inv_stud_id <- involvements |>
  filter(cohort_name == "I&A - Y9") |>
  select(stud_id) |>
  unique()

ia_y9_inclusion_age <- involvements |>
  filter(cohort_name == "I&A - Y9") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% ia_y9_inclusion_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id"
  ) |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(ia_y9_inclusion_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Inclusion & Attendance Y9 team",
       x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

Overall, children involved with the team show lower attendance through secondary school than those without:

```{r}
#| label: plot secondary attendance with and without I A Y9 team involvement
#| fig-height: 3.5

y9_inv_stud_id <- involvements |>
  filter(cohort_name == "I&A - Y9") |>
  select(stud_id) |>
  distinct() |>
  mutate(y9_team = TRUE)

attend_pri <- attend |>
  filter(
    ncy %in% c(7,8,9,10,11) & year >= 2021
  )

y9_primary <- attend_pri |>
  left_join(y9_inv_stud_id, by = "stud_id") |>
  mutate(y9_team = if_else(is.na(y9_team),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","y9_team")) |>
  group_by(ncy,y9_team) |>
  summarise_avg() |>
  mutate(category = "overall")

ggplot(y9_primary, aes(x = ncy, y = mean.percent_present, colour = y9_team, group = y9_team)) +
  geom_point() +
  geom_line(alpha = 0.7 ) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2) +
  #geom_vline(aes(xintercept = 6.5), linetype = "dotted", colour = "gray") +
  labs(title = "Attendance through secondary school - pupils without and with I&A Y9 involvement",
       subtitle = "Average percentage of available sessions attended since 2021; +- 95 CI",
       caption = "data from Capita One") +
  barplottheme_minimal +
  theme(axis.title.x = eb,
        #axis.text.y = eb,
        legend.position = "none",
        strip.background = eb,
        strip.text = element_text(size = 8),
        plot.title = element_markdown(size = 12)) +
  MetBrewer::scale_fill_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent)
```

We'll take the same approach for the Y9 team as we did for Y4 - comparing attendance with & without the team across a number of characteristic groups. This shows a similar pattern to what we saw with the I&A Y4 team - at the Y9 to Y10 boundary, children involved with the team show a more severe drop in attendance than those without, across all the groups we've looked at here. The unfortunate conclusion is that we can't say much about the team's effectiveness from our data, and we have to assume that there are factors at work that are not present in the data.
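One way the drop at the Y9 to Y10 boundary could be quantified, mirroring the per-pupil Y6 to Y7 difference calculated for the Y4 team earlier, is sketched below on toy data. This is an illustration only: the `toy` data frame and its values are hypothetical, and a plain `mean()` stands in for the report's `summarise_avg()` helper.

```r
library(dplyr)
library(tidyr)

# Hypothetical per-pupil attendance by curriculum year (not real figures)
toy <- tibble(
  stud_id = c(1, 1, 2, 2, 3, 3),
  y9_team = c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE),
  ncy = c(9, 10, 9, 10, 9, 10),
  percent_present = c(0.80, 0.70, 0.95, 0.93, 0.90, 0.92)
)

change_9_10 <- toy |>
  # one row per pupil, with a column for each curriculum year
  pivot_wider(names_from = ncy, values_from = percent_present) |>
  # per-pupil change; pupils missing either year get NA and drop out of the mean
  mutate(change = `10` - `9`) |>
  group_by(y9_team) |>
  summarise(mean_change = mean(change, na.rm = TRUE),
            n_pupils = sum(!is.na(change)),
            .groups = "drop")

change_9_10
```

Working with per-pupil differences means only pupils observed in both years contribute, which avoids comparing two different sets of children across the boundary.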
```{r} #| label: I A Y9 effectiveness by characteristic group #| fig-height: 8 attend_y8 <- attend |> filter( ncy %in% c(8,9,10,11) & year >= 2021 ) y8_overall <- attend_y8 |> left_join(y9_inv_stud_id, by = "stud_id") |> mutate(y9_team = if_else(is.na(y9_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","y9_team")) |> group_by(ncy,y9_team) |> summarise_avg() |> mutate(category = "all pupils") y8_deprived <- attend_y8 |> left_join(y9_inv_stud_id, by = "stud_id") |> filter(ward_imd_score >= 40 ) |> mutate(y9_team = if_else(is.na(y9_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","y9_team")) |> group_by(ncy,y9_team) |> summarise_avg() |> mutate(category = "most deprived wards") y8_sen_support <- attend_y8 |> left_join(y9_inv_stud_id, by = "stud_id") |> filter(sen_level == "SEN Support" ) |> mutate(y9_team = if_else(is.na(y9_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","y9_team")) |> group_by(ncy,y9_team) |> summarise_avg() |> mutate(category = "SEN Support") y8_ehcp <- attend_y8 |> left_join(y9_inv_stud_id, by = "stud_id") |> filter(sen_level == "EHCP" ) |> mutate(y9_team = if_else(is.na(y9_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","y9_team")) |> group_by(ncy,y9_team) |> summarise_avg() |> mutate(category = "EHCP") sa_y7 <- attend |> filter(ncy == 7, severe_absence == 1) |> select(stud_id) |> distinct() y8_sa_y7 <- attend_y8 |> left_join(y9_inv_stud_id, by = "stud_id") |> filter(stud_id %in% sa_y7$stud_id) |> mutate(y9_team = if_else(is.na(y9_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","y9_team")) |> group_by(ncy,y9_team) |> summarise_avg() |> mutate(category = "severely absent y7") sa_y9 <- attend |> filter(ncy == 9, severe_absence == 1) |> select(stud_id) |> distinct() y8_sa_y9 <- attend_y8 |> left_join(y9_inv_stud_id, by = "stud_id") |> filter(stud_id %in% sa_y9$stud_id) |> mutate(y9_team = 
    if_else(is.na(y9_team),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","y9_team")) |>
  group_by(ncy,y9_team) |>
  summarise_avg() |>
  mutate(category = "severely absent y9")

rbind(
  y8_overall,
  y8_sen_support,
  y8_deprived,
  y8_ehcp,
  y8_sa_y7,
  y8_sa_y9
) |>
  mutate(category = factor(category, levels = c("all pupils",
                                                "most deprived wards",
                                                "SEN Support",
                                                "EHCP",
                                                "severely absent y7",
                                                "severely absent y9")
  )) |>
  ggplot(aes(x = ncy, y = mean.percent_present, colour = y9_team, group = y9_team)) +
  geom_point() +
  geom_line(alpha = 0.7 ) + #linetype = "dotted") +
  geom_text(aes(label = scales::percent(mean.percent_present, accuracy = 0.1)), size = 3, nudge_y = -0.04) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, alpha = 0.7) +
  geom_text(aes(label = scales::percent(round(mean.percent_present,3))), vjust = 2, colour = "white", size = 4, position = position_dodge(0.9)) +
  facet_wrap(vars(category)) +
  labs(title = "Attendance through years 8 to 11 - pupils without and with I&A Y9 involvement",
       subtitle = "Average percentage of available sessions attended since 2021; +- 95 CI",
       caption = "data from Capita One") +
  barplottheme_minimal +
  theme(axis.title.x = eb,
        # axis.text.y = eb,
        legend.position = "none",
        strip.background = eb,
        strip.text = element_text(size = 8),
        plot.title = element_markdown(size = 12)) +
  MetBrewer::scale_fill_met_d("Egypt")
```

Exclusion rates before & after the I&A Y9 team involvement:

```{r}
#| label: plot I A Y9 exclusions
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "I&A - Y9") |>
  ggplot(aes(x = term_index, y = mean.percent_excluded)) +
  geom_point(colour = "dark red") +
  geom_errorbar(aes(ymin = lower.ci.percent_excluded, ymax = upper.ci.percent_excluded), width = 0.2, colour = "dark red") +
  labs(title = "Exclusion before and after intervention - I&A - Y9",
       subtitle = "Average percentage of available sessions where the child was excluded +- 95 CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8),
        legend.position = "none") +
  geom_hline(yintercept = excl_2023_avg, linetype = "dashed", colour = "dark grey") +
  # accuracy is numeric, so 0.1 rather than the integer-suffixed 0.1L
  geom_text(aes(x = 4, label = paste("2023 avg",scales::percent(excl_2023_avg, accuracy = 0.1)), y = excl_2023_avg),
            size = 2.5, vjust = 1, colour = "dark grey")
```

## I&A - Complex SEND

```{r}
#| label: I A complex send volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "I&A - Complex SEND",
         exam_year_inv_start >= 2019,
         exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "I&A - Complex SEND volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

ia_cs_inclusion_inv_stud_id <- involvements |>
  filter(cohort_name == "I&A - Complex SEND") |>
  select(stud_id) |>
  unique()

ia_cs_inclusion_age <- involvements |>
  filter(cohort_name == "I&A - Complex SEND") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% ia_cs_inclusion_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id"
  ) |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(ia_cs_inclusion_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3,
            vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Inclusion & Attendance Complex SEND",
       x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

:::{.callout-note}
Although the team is called 'Complex SEND', almost half of the children involved with the team do not show up as having any special educational needs, according to the school census data.
:::

```{r}
#| label: I A complex send table of volumes by sen level and primary need
#| layout: [45, -10, 45]

involvements |>
  filter(
    cohort_name == "I&A - Complex SEND",
    !is.na(sen_level)) |>
  group_by(
    #sen_need_category
    sen_level
  ) |>
  summarise(child_count = n_distinct(stud_id)) |>
  arrange(desc(child_count)) |>
  gt(rowname_col = "sen_level") |>
  tab_header(
    title = "Inclusion & Attendance Complex SEND",
    subtitle = "counts of children with involvements in 2024, by SEN level"
  ) |>
  tab_options(
    table.align = "left",
    heading.title.font.size = 12,
    heading.subtitle.font.size= 10,
    heading.align = "left",
    column_labels.font.size = 14,
    stub.font.size = 12,
    column_labels.hidden = TRUE
  ) |>
  cols_align("left",'sen_level')

involvements |>
  filter(
    cohort_name == "I&A - Complex SEND",
    !is.na(sen_level)) |>
  group_by(
    primary_specific_need
  ) |>
  summarise(child_count = n_distinct(stud_id)) |>
  arrange(desc(child_count)) |>
  # the stub column must exist in the summarised data, so use primary_specific_need here
  gt(rowname_col = "primary_specific_need") |>
  tab_header(
    title = "Inclusion & Attendance Complex SEND",
    subtitle = "counts of children with involvements in 2024, by primary need"
  ) |>
  tab_options(
    table.align = "left",
    heading.title.font.size = 12,
    heading.subtitle.font.size= 10,
    heading.align = "left",
    column_labels.font.size = 14,
    stub.font.size = 12,
    column_labels.hidden = TRUE
  ) |>
  cols_align("left",'primary_specific_need')
```

I&A Complex SEND team involvement is followed by a sustained improvement in attendance levels:

```{r}
#| label: plot I A Complex SEND before after attendance
#| fig-height: 3.5
inv_summary_long_start |>
  filter(cohort_name == "I&A - Complex SEND") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue") +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue") +
  labs(title = "Attendance before and after intervention - I&A - Complex SEND",
       subtitle = "Average percentage of available sessions attended +- 95 CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8),
        legend.position = "none")
```

Although the direction of travel for average attendance changes, coded absences for the year either side of the involvement start show an overall increase. It's worth bearing in mind, though, that children with complex SEND show a general decline in average attendance beyond the decline seen for other children.
There are also small improvements in lateness, illness & exclusion levels here:

```{r}
#| label: plot before & after by reason I A Complex SEND
#| fig-height: 4

before_after_year_fun(cohort_name_selection = "I&A - Complex SEND") |>
  ggplot(aes(x = name, y = mean, fill = prior_post, label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci,ymin = lower_ci), width = 0.2, alpha = 0.5, position = position_dodge(width = 0.9), colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons before and after I&A Complex SEND",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12),
        legend.position = "none",
        axis.title = eb,
        axis.text.y = eb) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 2))
```

We've also looked at a 'with & without' analysis, both for children with SEN support and for children with an EHC plan. In both cases, children involved with the team have consistently lower attendance than those without. This presumably reflects the 'complex' nature of the cohort's needs.
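The '+- 95 CI' error bars in these comparisons are t-based confidence intervals around the group mean. A minimal sketch of that calculation on toy data is given below; the `toy` data frame and its values are hypothetical. One assumption differs from the report's chunks: the sample size here counts only non-missing values, whereas `n()` would also count rows with a missing `percent_present`.

```r
library(dplyr)

# Hypothetical per-pupil attendance rates (not real figures)
toy <- tibble(
  inv_flag = c(0, 0, 0, 0, 1, 1, 1, 1),
  percent_present = c(0.96, 0.94, 0.92, 0.98, 0.80, 0.70, 0.90, 0.60)
)

ci_summary <- toy |>
  group_by(inv_flag) |>
  summarise(
    mean_present = mean(percent_present, na.rm = TRUE),
    sd_present = sd(percent_present, na.rm = TRUE),
    n_present = sum(!is.na(percent_present)),  # non-missing observations only
    .groups = "drop"
  ) |>
  mutate(
    se_present = sd_present / sqrt(n_present),
    # two-sided 95% critical value from the t distribution
    t_crit = qt(1 - 0.05 / 2, df = n_present - 1),
    lower_ci = mean_present - t_crit * se_present,
    upper_ci = mean_present + t_crit * se_present
  )

ci_summary
```

With small groups the t critical value is noticeably larger than 1.96, so the intervals for the involved cohort will tend to be wider than those for the much larger comparison group.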
```{r}
#| label: complex SEND cohort comparison - sen support attendance

# filter attendance data to SEN Support pupils, 2021 onwards
comp_CS <- attend |>
  filter(year >= 2021) |>
  filter(sen_level == "SEN Support")

inv_CS <- involvements |>
  filter(cohort_name == "I&A - Complex SEND") |>
  select(stud_id) |>
  mutate(inv_flag = 1) |>
  distinct()

# join involvements to attendance data
comp_CS <- comp_CS |>
  left_join(inv_CS, by = "stud_id") |>
  mutate(inv_flag = case_when(
    is.na(inv_flag) ~ 0,TRUE ~ inv_flag)) |>
  distinct()

comp_CS_summary <- comp_CS |>
  group_by(inv_flag, ncy) |>
  summarise (mean.percent_present = mean(percent_present, na.rm = TRUE),
             sd.percent_present = sd(percent_present, na.rm = TRUE),
             n.percent_present = n()
  ) |>
  mutate(se.percent_present = sd.percent_present / sqrt(n.percent_present),
         lower.ci.percent_present = mean.percent_present - qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present,
         upper.ci.percent_present = mean.percent_present + qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present)

# plot
ggplot(comp_CS_summary |> filter(ncy >= 0),
       aes(x = ncy, y = mean.percent_present, colour = factor(inv_flag))) +
  geom_point() +
  geom_line(linetype = "dotted") +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2) +
  geom_text(aes(label = scales::percent(round(mean.percent_present,3), accuracy = 0.1)), vjust = 3, colour = "white", size = 3, position = position_dodge(0.9)) +
  labs(title = "Attendance by curriculum year - children requiring SEN Support without and with I&A - Complex SEND involvement",
       subtitle = "percentage of available sessions attended +- 95 CI",
       x = "national curriculum year") +
  barplottheme_minimal +
  theme(axis.title.x = eb,
        plot.title = element_markdown(size = 10),
        legend.position = "none"
  ) +
  scale_x_continuous(breaks = seq(0,11)) +
  scale_y_continuous(labels = scales::percent) +
  MetBrewer::scale_fill_met_d("Egypt")
```

```{r}
#| label: Complex SEND cohort comparison - ehcp support attendance

# filter attendance data to EHCP pupils, 2021 onwards
comp_CS <- attend |>
  filter(year >= 2021) |>
  filter(sen_level == "EHCP")

inv_CS <- involvements |>
  filter(cohort_name == "I&A - Complex SEND") |>
  select(stud_id) |>
  mutate(inv_flag = 1) |>
  distinct()

# join involvements to attendance data
comp_CS <- comp_CS |>
  left_join(inv_CS, by = "stud_id") |>
  mutate(inv_flag = case_when(
    is.na(inv_flag) ~ 0,TRUE ~ inv_flag)) |>
  distinct()

comp_CS_summary <- comp_CS |>
  group_by(inv_flag, ncy) |>
  summarise (mean.percent_present = mean(percent_present, na.rm = TRUE),
             sd.percent_present = sd(percent_present, na.rm = TRUE),
             n.percent_present = n()
  ) |>
  mutate(se.percent_present = sd.percent_present / sqrt(n.percent_present),
         lower.ci.percent_present = mean.percent_present - qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present,
         upper.ci.percent_present = mean.percent_present + qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present)

# plot
ggplot(comp_CS_summary |> filter(ncy >= 0),
       aes(x = ncy, y = mean.percent_present, colour = factor(inv_flag))) +
  geom_point() +
  geom_line(linetype = "dotted") +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2) +
  geom_text(aes(label = scales::percent(round(mean.percent_present,3), accuracy = 0.1)), vjust = 3, colour = "white", size = 3, position = position_dodge(0.9)) +
  labs(title = "Attendance by curriculum year - children with an EHC Plan without and with I&A - Complex SEND involvement",
       subtitle = "percentage of available sessions attended +- 95 CI",
       x = "national curriculum year") +
  barplottheme_minimal +
  theme(#axis.title.x = eb,
        #axis.text.y = eb,
        plot.title = element_markdown(size = 10),
        legend.position = "none"
  ) +
  scale_x_continuous(breaks = seq(0,11)) +
  scale_y_continuous(labels = scales::percent) +
  MetBrewer::scale_fill_met_d("Egypt")
```

Exclusions before and after I&A Complex SEND involvement:

```{r}
#| label: plot I A complex SEND exclusions
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "I&A - Complex SEND") |>
  ggplot(aes(x = term_index, y = mean.percent_excluded)) +
  geom_point(colour = "dark red") +
  geom_errorbar(aes(ymin = lower.ci.percent_excluded, ymax = upper.ci.percent_excluded), width = 0.2, colour = "dark red") +
  labs(title = "Exclusion before and after intervention - I&A Complex SEND",
       subtitle = "Average percentage of available sessions where the child was excluded +- 95 CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8),
        legend.position = "none") +
  geom_hline(yintercept = excl_2023_avg, linetype = "dashed", colour = "dark grey") +
  # accuracy is numeric, so 0.1 rather than the integer-suffixed 0.1L
  geom_text(aes(x = 4, label = paste("2023 avg",scales::percent(excl_2023_avg, accuracy = 0.1)), y = excl_2023_avg),
            size = 2.5, vjust = 1, colour = "dark grey")
```

## Non PIP/SIP - Think for the Future

Think for the Future focuses on behaviour, resilience & inclusion. There are around 100 of these involvements each year, for children of all school ages, but mostly around age 11.
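Ages on involvement start in this report are derived from the gap between date of birth and the involvement open date, divided by 365.25 and rounded down. A minimal sketch with hypothetical dates is shown below; the values are illustrative only.

```r
# Hypothetical dates of birth and involvement open dates (not real pupils)
dob <- as.Date(c("2012-09-15", "2010-01-31"))
open_date <- as.Date(c("2024-03-01", "2024-01-30"))

# Subtracting Dates gives a difference in days; 365.25 approximates a year
age <- floor(as.numeric(open_date - dob) / 365.25)

age
```

The 365.25-day year is an approximation, so a child whose involvement opens on or very near a birthday can be assigned the younger age. For counts by age band this is unlikely to matter, but it is worth knowing when checking individual records.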
```{r}
#| label: non pip sip tftf volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "Non PIP/SIP - Think for the Future",
         exam_year_inv_start >= 2019,
         exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  # title corrected: this chunk plots Think for the Future, not I&A - Y9
  labs(title = "Non PIP/SIP - Think for the Future volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

tftf_inclusion_inv_stud_id <- involvements |>
  filter(cohort_name == "Non PIP/SIP - Think for the Future") |>
  select(stud_id) |>
  unique()

tftf_inclusion_age <- involvements |>
  filter(cohort_name == "Non PIP/SIP - Think for the Future") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% tftf_inclusion_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id"
  ) |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(tftf_inclusion_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Non PIP/SIP - Think for the Future",
       x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

The focus here is on exclusions, but for completeness we'll also show the attendance plot.
The children receiving this involvement show very poor attendance, which continues to worsen after the involvement:

```{r}
#| label: plot non pip sip think for the future before after attendance
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "Non PIP/SIP - Think for the Future") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue") +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue") +
  labs(title = "Attendance before and after intervention - Non PIP/SIP - Think for the Future",
       subtitle = "Average percentage of available sessions attended +- 95% CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

This change in average attendance is driven by growth in the share of pupils who are severely absent (0-50% attendance; the orange bars here):

```{r}
#| label: categorised stack plot non pip sip tftf

# reshape the per-half-term attendance categories to long format
stack_data <- involvements |>
  filter(cohort_name == "Non PIP/SIP - Think for the Future") |>
  select(stud_id,starts_with("attend_cat")) |>
  pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index,"attend_cat_","")) |>
  mutate(index = str_replace(index,"start_","")) |>
  mutate(index = str_replace(index,"_"," ")) |>
  mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7","prior 6","prior 5","prior 4","prior 3","prior 2","prior 1",
                                          "post 1","post 2","post 3","post 4","post 5","post 6","post 7","post 8","post 9","post 10","post 11","post 12"))) |>
  group_by(index, attend_cat) |>
  filter(attend_cat != "out of scope") |>
  #filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 100%","preschool","leaving age","covid","out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb, legend.position = "right", axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Categorised attendance before & after Non PIP/SIP - Think for the Future",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       caption = "data from Capita One",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label = str_wrap("inv start", width = 10), y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

Looking at exclusions, the rates immediately prior to involvement are extremely high. There is some reduction in the average rate afterwards, but it is slight and takes place over a long period.
```{r}
#| label: plot non pip sip think for the future exclusions
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "Non PIP/SIP - Think for the Future") |>
  ggplot(aes(x = term_index, y = mean.percent_excluded)) +
  geom_point(colour = "dark red") +
  geom_errorbar(aes(ymin = lower.ci.percent_excluded, ymax = upper.ci.percent_excluded), width = 0.2, colour = "dark red") +
  labs(title = "Exclusion before and after intervention - Non PIP/SIP - Think for the Future",
       subtitle = "Average percentage of available sessions where the child was excluded +- 95% CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none") +
  geom_hline(yintercept = excl_2023_avg, linetype = "dashed", colour = "dark grey") +
  geom_text(aes(x = 4, label = paste("2023 avg", scales::percent(excl_2023_avg, accuracy = 0.1)), y = excl_2023_avg), size = 2.5, vjust = 1, colour = "dark grey")
```

Finally for this involvement, the coded reasons show a significant year on year increase in overall absences, driven mostly by an increase in the 'no reason' category:

```{r}
#| label: plot before & after by reason Non PIP/SIP - Think for the Future
#| fig-height: 4

before_after_year_fun(cohort_name_selection = "Non PIP/SIP - Think for the Future") |>
  ggplot(aes(x = name, y = mean, fill = prior_post, label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci, ymin = lower_ci), width = 0.2, alpha = 0.5, position = position_dodge(width = 0.9), colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons before and after Non PIP/SIP - Think for the Future",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12), legend.position = "none", axis.title = eb, axis.text.y = eb) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 2))
```

## Non PIP/SIP - Theraputic Outreach

:::{.callout-tip}
The volumes and age profile, categorised attendance profile and before & after plots here all look very similar to the Non PIP/SIP - Think for the Future involvements given above. Analysis shows that these are the exact same cohort of pupils! Is this an issue with the data, or to be expected?
:::

```{r}
#| label: check cohort overlaps non pip sip

# flag the distinct pupils in each cohort, then count the combinations
to <- involvements |>
  filter(cohort_name == "Non PIP/SIP - Theraputic Outreach") |>
  select(stud_id) |>
  distinct() |>
  mutate(theraputic_outreach = 1)

tftf <- involvements |>
  filter(cohort_name == "Non PIP/SIP - Think for the Future") |>
  select(stud_id) |>
  distinct() |>
  mutate(think_for_the_future = 1)

overlap <- full_join(to, tftf, by = "stud_id") |>
  group_by(theraputic_outreach, think_for_the_future) |>
  tally()
```

There are around 100 involvements starting per year with the age profile peaking at 11:

```{r}
#| label: non pip sip volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "Non PIP/SIP - Theraputic Outreach", exam_year_inv_start >= 2019, exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Non PIP/SIP - Theraputic Outreach volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

theraputic_outreach_inv_stud_id <- involvements |>
  filter(cohort_name == "Non PIP/SIP - Theraputic Outreach") |>
  select(stud_id) |>
  unique()

theraputic_outreach_age <- involvements |>
  filter(cohort_name == "Non PIP/SIP - Theraputic Outreach") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% theraputic_outreach_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id") |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(theraputic_outreach_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Non PIP/SIP - Theraputic Outreach", x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

Attendance levels continue to decline after non PIP/SIP theraputic outreach:

```{r}
#| label: plot non pip sip theraputic outreach before after attendance
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "Non PIP/SIP - Theraputic Outreach") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue") +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue") +
  labs(title = "Attendance before and after intervention - Non PIP/SIP - Theraputic Outreach",
       subtitle = "Average percentage of available sessions attended +- 95% CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

```{r}
#| label: categorised stack plot non pip sip theraputic outreach

stack_data <- involvements |>
  filter(cohort_name == "Non PIP/SIP - Theraputic Outreach") |>
  select(stud_id,starts_with("attend_cat")) |>
  pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index,"attend_cat_","")) |>
  mutate(index = str_replace(index,"start_","")) |>
  mutate(index = str_replace(index,"_"," ")) |>
  mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7","prior 6","prior 5","prior 4","prior 3","prior 2","prior 1",
                                          "post 1","post 2","post 3","post 4","post 5","post 6","post 7","post 8","post 9","post 10","post 11","post 12"))) |>
  group_by(index, attend_cat) |>
  filter(attend_cat != "out of scope") |>
  #filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 100%","preschool","leaving age","covid","out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb, legend.position = "right", axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Categorised attendance before & after Non PIP/SIP - Theraputic Outreach",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       caption = "data from Capita One",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label = str_wrap("inv start", width = 10), y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

There is also
no improvement in rates of exclusion absences for children with this involvement, with rates remaining well above average following involvement start:

```{r}
#| label: plot non pip sip theraputic outreach exclusions
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "Non PIP/SIP - Theraputic Outreach") |>
  ggplot(aes(x = term_index, y = mean.percent_excluded)) +
  geom_point(colour = "dark red") +
  geom_errorbar(aes(ymin = lower.ci.percent_excluded, ymax = upper.ci.percent_excluded), width = 0.2, colour = "dark red") +
  labs(title = "Exclusion before and after intervention - Non PIP/SIP - Theraputic Outreach",
       subtitle = "Average percentage of available sessions where the child was excluded +- 95% CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none") +
  geom_hline(yintercept = excl_2023_avg, linetype = "dashed", colour = "dark grey") +
  geom_text(aes(x = 4, label = paste("2023 avg", scales::percent(excl_2023_avg, accuracy = 0.1)), y = excl_2023_avg), size = 2.5, vjust = 1, colour = "dark grey")
```

## SIP - Theraputic Outreach

```{r}
#| label: sip theraputic outreach volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "SIP - Theraputic Outreach", exam_year_inv_start >= 2019, exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "SIP - Theraputic Outreach volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

theraputic_outreach_inv_stud_id <- involvements |>
  filter(cohort_name == "SIP - Theraputic Outreach") |>
  select(stud_id) |>
  unique()

theraputic_outreach_age <- involvements |>
  filter(cohort_name == "SIP - Theraputic Outreach") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% theraputic_outreach_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id") |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(theraputic_outreach_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - SIP - Theraputic Outreach", x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

SIP theraputic outreach involvements are associated with a continued decline in attendance:

```{r}
#| label: plot sip theraputic outreach before after attendance
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "SIP - Theraputic Outreach") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue") +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue") +
  labs(title = "Attendance before and after intervention - SIP - Theraputic Outreach",
       subtitle = "Average percentage of available sessions attended +- 95% CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

SIP theraputic outreach involvements are associated with a
significant improvement in exclusion rates:

```{r}
#| label: plot sip theraputic outreach exclusions
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "SIP - Theraputic Outreach") |>
  ggplot(aes(x = term_index, y = mean.percent_excluded)) +
  geom_point(colour = "dark red") +
  geom_errorbar(aes(ymin = lower.ci.percent_excluded, ymax = upper.ci.percent_excluded), width = 0.2, colour = "dark red") +
  labs(title = "Exclusion before and after intervention - SIP - Theraputic Outreach",
       subtitle = "Average percentage of available sessions where the child was excluded +- 95% CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none") +
  geom_hline(yintercept = excl_2023_avg, linetype = "dashed", colour = "dark grey") +
  geom_text(aes(x = 4, label = paste("2023 avg", scales::percent(excl_2023_avg, accuracy = 0.1)), y = excl_2023_avg), size = 2.5, vjust = 1, colour = "dark grey")
```

## PIP - Theraputic Outreach

```{r}
#| label: PIP theraputic outreach volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "PIP - Theraputic Outreach", exam_year_inv_start >= 2019, exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "PIP - Theraputic Outreach volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

theraputic_outreach_inv_stud_id <- involvements |>
  filter(cohort_name == "PIP - Theraputic Outreach") |>
  select(stud_id) |>
  unique()

theraputic_outreach_age <- involvements |>
  filter(cohort_name == "PIP - Theraputic Outreach") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% theraputic_outreach_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id") |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(theraputic_outreach_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - PIP - Theraputic Outreach", x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

After PIP theraputic outreach involvements, attendance is very poor, although the picture here is better than for Secondary Inclusion Panel theraputic outreach and shows a slow improvement:

```{r}
#| label: plot PIP theraputic outreach before after attendance
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "PIP - Theraputic Outreach") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue") +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue") +
  labs(title = "Attendance before and after intervention - PIP - Theraputic Outreach",
       subtitle = "Average percentage of available sessions attended +- 95% CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

Though rates do not return to previous levels or the overall average, PIP theraputic outreach involvements are
associated with a significant improvement in exclusion rates:

```{r}
#| label: plot PIP theraputic outreach exclusions
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "PIP - Theraputic Outreach") |>
  ggplot(aes(x = term_index, y = mean.percent_excluded)) +
  geom_point(colour = "dark red") +
  geom_errorbar(aes(ymin = lower.ci.percent_excluded, ymax = upper.ci.percent_excluded), width = 0.2, colour = "dark red") +
  labs(title = "Exclusion before and after intervention - PIP - Theraputic Outreach",
       subtitle = "Average percentage of available sessions where the child was excluded +- 95% CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none") +
  geom_hline(yintercept = excl_2023_avg, linetype = "dashed", colour = "dark grey") +
  geom_text(aes(x = 4, label = paste("2023 avg", scales::percent(excl_2023_avg, accuracy = 0.1)), y = excl_2023_avg), size = 2.5, vjust = 1, colour = "dark grey")
```

## Secondary Inclusion Panel

:::{.callout-caution}
As with Non PIP/SIP Think for the Future and Non PIP/SIP Theraputic Outreach, SIP theraputic outreach and Secondary Inclusion Panel seem to be the exact same cohort of pupils
:::

TO DO - check this in the source data

```{r}
#| label: check cohort overlaps sip

# flag the distinct pupils in each cohort, then count the combinations
sip_to <- involvements |>
  filter(cohort_name == "SIP - Theraputic Outreach") |>
  select(stud_id) |>
  distinct() |>
  mutate(SIP_theraputic_outreach = 1)

sip <- involvements |>
  filter(cohort_name == "Secondary Inclusion Panel") |>
  select(stud_id) |>
  distinct() |>
  mutate(SIP = 1)

overlap <- full_join(sip, sip_to, by = "stud_id") |>
  group_by(SIP_theraputic_outreach, SIP) |>
  tally()
```

```{r}
#| label: sip volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "Secondary Inclusion Panel", exam_year_inv_start >= 2019, exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Secondary Inclusion Panel volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

sip_inv_stud_id <- involvements |>
  filter(cohort_name == "Secondary Inclusion Panel") |>
  select(stud_id) |>
  unique()

# filter on the Secondary Inclusion Panel cohort, matching the ids and plot title
sip_age <- involvements |>
  filter(cohort_name == "Secondary Inclusion Panel") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% sip_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id") |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(sip_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Secondary Inclusion Panel", x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

Secondary Inclusion Panel involvements are associated with a continued decline in attendance:

```{r}
#| label: plot sip before after attendance
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "Secondary Inclusion Panel") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue") +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue") +
  labs(title = "Attendance before and after intervention - Secondary Inclusion Panel",
       subtitle = "Average percentage of available sessions attended +- 95% CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

Secondary Inclusion Panel involvements are associated with a significant improvement in exclusion rates:

```{r}
#| label: plot sip exclusions
#| fig-height: 3.5

# reference line and its label both use the secondary average
inv_summary_long_start |>
  filter(cohort_name == "Secondary Inclusion Panel") |>
  ggplot(aes(x = term_index, y = mean.percent_excluded)) +
  geom_point(colour = "dark red") +
  geom_errorbar(aes(ymin = lower.ci.percent_excluded, ymax = upper.ci.percent_excluded), width = 0.2, colour = "dark red") +
  labs(title = "Exclusion before and after intervention - Secondary Inclusion Panel",
       subtitle = "Average percentage of available sessions where the child was excluded +- 95% CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none") +
  geom_hline(yintercept = excl_2023_sec_avg, linetype = "dashed", colour = "dark grey") +
  geom_text(aes(x = 4, label = paste("2023 Secondary avg", scales::percent(excl_2023_sec_avg, accuracy = 0.1)), y = excl_2023_sec_avg), size = 2.5, vjust = 1, colour = "dark grey")
```

## Primary Inclusion Panel

:::{.callout-caution}
As with several other pairs of involvements, PIP and PIP theraputic outreach are the exact same cohort
:::

```{r}
#| label: pip volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "Primary Inclusion Panel", exam_year_inv_start >= 2019, exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Primary Inclusion Panel volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

pip_inv_stud_id <- involvements |>
  filter(cohort_name == "Primary Inclusion Panel") |>
  select(stud_id) |>
  unique()

pip_age <- involvements |>
  filter(cohort_name == "Primary Inclusion Panel") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% pip_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id") |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(pip_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Primary Inclusion Panel", x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

Primary Inclusion Panel involvements are associated with a continued decline in attendance:

```{r}
#| label: plot pip before after attendance
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "Primary Inclusion Panel") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue") +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue") +
  labs(title = "Attendance before and after intervention - Primary Inclusion Panel",
       subtitle = "Average percentage of available sessions attended +- 95% CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

Primary Inclusion Panel involvements are associated with a significant improvement in exclusion rates:

```{r}
#| label: plot pip exclusions
#| fig-height: 3.5

# reference line and its label both use the primary average
inv_summary_long_start |>
  filter(cohort_name == "Primary Inclusion Panel") |>
  ggplot(aes(x = term_index, y = mean.percent_excluded)) +
  geom_point(colour = "dark red") +
  geom_errorbar(aes(ymin = lower.ci.percent_excluded, ymax = upper.ci.percent_excluded), width = 0.2, colour = "dark red") +
  labs(title = "Exclusion before and after intervention - Primary Inclusion Panel",
       subtitle = "Average percentage of available sessions where the child was excluded +- 95% CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none") +
  geom_hline(yintercept = excl_2023_pri_avg, linetype = "dashed", colour = "dark grey") +
  geom_text(aes(x = 4, label = paste("2023 Primary avg", scales::percent(excl_2023_pri_avg, accuracy = 0.1)), y = excl_2023_pri_avg), size = 2.5, vjust = 1, colour = "dark grey")
```

## PNOR 50+

Volumes of this involvement are fairly low, and many do not appear to have a Date of Birth recorded, so the age profile could not be calculated.
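To illustrate why missing dates of birth prevent an age profile, the sketch below applies the same age formula used throughout this report to a small made-up table. The data frame `toy_inv` and its values are hypothetical. A record without a `dob` yields an `NA` age, and the share of such records gives a quick measure of how incomplete the profile would be.

```r
# Toy involvement records (hypothetical); pupil 3 has no date of birth recorded
toy_inv <- data.frame(
  stud_id = 1:3,
  open_date = as.Date(c("2024-01-15", "2024-03-01", "2024-05-20")),
  dob = as.Date(c("2009-06-30", "2010-02-10", NA))
)

# Age in whole years on the involvement open date.
# A missing dob propagates through the subtraction and gives an NA age.
toy_inv$age <- floor(as.numeric((toy_inv$open_date - toy_inv$dob) / 365.25))

# Share of records with no date of birth
missing_dob_rate <- mean(is.na(toy_inv$dob))

toy_inv
missing_dob_rate
```

Reporting this share alongside any age chart would make clear how much of the cohort the chart actually represents.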
```{r}
#| label: pnor50+ volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "PNOR 50+", exam_year_inv_start >= 2019, exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "PNOR 50+ volumes",
       subtitle = "involvements starting by academic year",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

pnor_50_stud_id <- involvements |>
  filter(cohort_name == "PNOR 50+") |>
  select(stud_id) |>
  unique()

pnor_50_age <- involvements |>
  filter(cohort_name == "PNOR 50+") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% pnor_50_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id") |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(pnor_50_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - PNOR 50+", x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

Volumes of this involvement prior to 2024 are very low, and the involvement generally pertains to older children, and specifically to those not on roll.
There is no attendance data after involvement start date for this cohort:

```{r}
#| label: plot pnor 50+ before after attendance
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "PNOR 50+") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue") +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue") +
  labs(title = "Attendance before and after intervention - PNOR 50+",
       subtitle = "Average percentage of available sessions attended +- 95% CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

The stacked plot is useful for understanding how recent these involvements are, and how much of the attendance data is either not present (NA - the red bars here) or out of scope (black bars), meaning it relates to periods that have not yet occurred or for which the data has not been released:

```{r}
#| label: categorised stack plot pnor 50+

# "out of scope" is deliberately kept in for this cohort so the black bars are shown
stack_data <- involvements |>
  filter(cohort_name == "PNOR 50+") |>
  select(stud_id,starts_with("attend_cat")) |>
  pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index,"attend_cat_","")) |>
  mutate(index = str_replace(index,"start_","")) |>
  mutate(index = str_replace(index,"_"," ")) |>
  mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7","prior 6","prior 5","prior 4","prior 3","prior 2","prior 1",
                                          "post 1","post 2","post 3","post 4","post 5","post 6","post 7","post 8","post 9","post 10","post 11","post 12"))) |>
  group_by(index, attend_cat) |>
  #filter(attend_cat != "out of scope") |>
  #filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 100%","preschool","leaving age","covid","out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb, legend.position = "right", axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Categorised attendance before & after - PNOR 50+",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       caption = "data from Capita One",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label = str_wrap("inv start", width = 10), y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

## I&A - Vulnerable Learner

```{r}
#| label: I&A - Vulnerable Learner volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "I&A - Vulnerable Learner", exam_year_inv_start >= 2019, exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "I&A - Vulnerable Learner volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

IA_VL_stud_id <- involvements |>
  filter(cohort_name == "I&A - Vulnerable Learner") |>
  select(stud_id) |>
  unique()

IA_VL_age <- involvements |>
  filter(cohort_name == "I&A - Vulnerable Learner") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% IA_VL_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id") |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(IA_VL_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - I&A - Vulnerable Learner", x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

I&A - Vulnerable Learner involvements are associated with very low attendance levels, and show no apparent improvement once the involvement is in place:

```{r}
#| label: plot I&A - Vulnerable Learner before after attendance
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "I&A - Vulnerable Learner") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue") +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue") +
  labs(title = "Attendance before and after intervention - I&A - Vulnerable Learner",
       subtitle = "Average percentage of available sessions attended +- 95% CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

The categorised stacked plot reveals more of what's happening with this cohort: some are of leaving age, but more simply go missing from the data.
Large numbers also become severely absent (0 - 50%, the orange bars here) after the involvement start date.

```{r}
#| label: categorised stack plot IA vulnerable learner

stack_data <- involvements |>
  filter(cohort_name == "I&A - Vulnerable Learner") |>
  select(stud_id,starts_with("attend_cat")) |>
  pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index,"attend_cat_","")) |>
  mutate(index = str_replace(index,"start_","")) |>
  mutate(index = str_replace(index,"_"," ")) |>
  mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7","prior 6","prior 5","prior 4","prior 3","prior 2","prior 1",
                                          "post 1","post 2","post 3","post 4","post 5","post 6","post 7","post 8","post 9","post 10","post 11","post 12"))) |>
  group_by(index, attend_cat) |>
  filter(attend_cat != "out of scope") |>
  #filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 100%","preschool","leaving age","covid","out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb,
        legend.position = "right",
        axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Categorised attendance before & after I&A - Vulnerable Learner",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       caption = "data from Capita One",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label =
             str_wrap("inv start", width = 10), y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

## Portage

The Portage service is associated with very young children with special educational needs.

```{r}
#| label: Portage volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "Portage",
         exam_year_inv_start >= 2019,
         exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Portage volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

portage_stud_id <- involvements |>
  filter(cohort_name == "Portage") |>
  select(stud_id) |>
  unique()

# age in whole years on the involvement open date
portage_age <- involvements |>
  filter(cohort_name == "Portage") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% portage_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id") |>
  mutate(age = floor(as.numeric((open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(portage_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Portage",
       x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

The team work largely with children with EHC plans and those who require SEN support, across a range of needs, principally speech, language & communication needs and autism.

```{r}
#| label: table of Portage count by characteristics
#| fig-align: center
#| layout: [45, -10, 45]

involvements |>
  filter(cohort_name == "Portage",
         !is.na(sen_level)) |>
  group_by(sen_level) |>
  summarise(child_count = n_distinct(stud_id)) |>
  arrange(desc(child_count)) |>
  gt(rowname_col = "sen_level") |>
  tab_header(
    title = "Portage",
    subtitle = "counts of children with involvements in 2024, by SEN level"
  ) |>
  tab_options(
    table.align = "left",
    heading.title.font.size = 12,
    heading.subtitle.font.size= 10,
    heading.align = "left",
    column_labels.font.size = 14,
    stub.font.size = 12,
    column_labels.hidden = TRUE
  ) |>
  cols_align("left",'sen_level')

involvements |>
  filter(cohort_name == "Portage",
         !is.na(imd_quartile)) |>
  group_by(imd_quartile) |>
  summarise(child_count = n_distinct(stud_id)) |>
  gt(rowname_col = "sen_level") |>
  tab_header(
    title = "Portage - counts by IMD quartile",
    subtitle = "4 = most deprived wards; 1 = least deprived"
  ) |>
  tab_options(
    table.align = "left",
    heading.title.font.size = 12,
    heading.subtitle.font.size= 10,
    heading.align = "left",
    column_labels.font.size = 14,
    stub.font.size = 12,
    column_labels.hidden = TRUE
  ) |>
  cols_align("left",'imd_quartile')
```

```{r}
involvements |>
  filter(
    cohort_name == "Portage",
    #!is.na(sen_level)
  ) |>
  group_by(primary_specific_need) |>
  summarise(child_count = n_distinct(stud_id)) |>
  arrange(desc(child_count)) |>
  gt(rowname_col = "primary_specific_need") |>
  tab_header(
    title = "Portage",
    subtitle = "counts of children with involvements in 2024, by primary specific_need"
  ) |>
  tab_options(
    table.align = "left",
    heading.title.font.size = 12,
    heading.subtitle.font.size= 10,
    heading.align = "left",
    column_labels.font.size = 14,
    stub.font.size = 12,
    column_labels.hidden = TRUE
  ) |>
  cols_align("left",'primary_specific_need')
```

For Portage we've taken attendance data from year 0 to year 6 for children with an EHC plan; SEN support; Speech,
Language and Communication Needs; and Autism, in each case comparing children with & without a Portage involvement. The differences here are small, but children with SEN support who are involved with the service have better attendance in all primary years except year 1. There is also evidence that children with Autism show better attendance when involved. For children with SLCN (the most prevalent primary need among children involved with the service) or an EHC plan, the picture is mixed, with no clear evidence of significantly better attendance. (Year 0 is removed here because little data is available.)

```{r}
#| label: Portage effectiveness by characteristic group
#| fig-height: 8

# flag pupils with any Portage involvement
portage_stud_id <- involvements |>
  filter(cohort_name == "Portage") |>
  select(stud_id) |>
  distinct() |>
  mutate(portage = TRUE)

attend_pri <- attend |>
  filter(ncy %in% c(0,1,2,3,4,5,6) & year >= 2021)

port_sen_support <- attend_pri |>
  left_join(portage_stud_id, by = "stud_id") |>
  filter(sen_level == "SEN Support") |>
  mutate(portage = if_else(is.na(portage),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","portage")) |>
  group_by(ncy,portage) |>
  summarise_avg() |>
  mutate(category = "SEN Support")

port_ehcp <- attend_pri |>
  left_join(portage_stud_id, by = "stud_id") |>
  filter(sen_level == "EHCP") |>
  mutate(portage = if_else(is.na(portage),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","portage")) |>
  group_by(ncy,portage) |>
  summarise_avg() |>
  mutate(category = "EHCP")

port_slcn <- attend_pri |>
  left_join(portage_stud_id, by = "stud_id") |>
  filter(primary_specific_need == "Speech, Language And Communication Needs") |>
  mutate(portage = if_else(is.na(portage),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","portage")) |>
  group_by(ncy,portage) |>
  summarise_avg() |>
  mutate(category = "Speech, Language And Communication Needs")

port_asd <- attend_pri |>
  left_join(portage_stud_id, by = "stud_id") |>
  filter(primary_specific_need == "Autism") |>
  mutate(portage = if_else(is.na(portage),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","portage")) |>
  group_by(ncy,portage) |>
  summarise_avg() |>
  mutate(category = "Autism")

rbind(
  #port_overall,
  port_sen_support,
  #port_deprived,
  port_ehcp,
  port_slcn,
  port_asd
) |>
  mutate(category = factor(category, levels = c("SEN Support",
                                                "EHCP",
                                                "Speech, Language And Communication Needs",
                                                "Autism"))) |>
  filter(ncy > 0) |> # huge error bars on the y0 attendance
  ggplot(aes(x = ncy, y = mean.percent_present, colour = portage, group = portage)) +
  geom_point() +
  geom_line(alpha = 0.7) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2) +
  geom_text(aes(label = scales::percent(round(mean.percent_present,3))), vjust = 2, colour = "white", size = 4, position = position_dodge(0.9)) +
  geom_vline(aes(xintercept = 6.5), linetype = "dotted", colour = "gray") +
  facet_wrap(vars(category)) +
  labs(title = "Attendance through years 1 to 6 - pupils without and with Portage involvement",
       subtitle = "Average percentage of available sessions attended since 2021; +- 95 CI",
       caption = "data from Capita One") +
  scale_y_continuous(labels = scales::percent, limits = c(0.8, 1.0)) +
  scale_x_continuous(labels = seq(0,6), breaks = seq(0,6)) +
  barplottheme_minimal +
  theme(axis.title.x = eb,
        #axis.text.y = eb,
        legend.position = "none",
        strip.background = eb,
        strip.text = element_text(size = 8),
        plot.title = element_markdown(size = 12)) +
  MetBrewer::scale_fill_met_d("Egypt")
```

## Rowan outreach

```{r}
#| label: Rowan outreach volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "Rowan outreach",
         exam_year_inv_start >= 2019,
         exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Rowan outreach activity",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

ra_stud_id <- involvements |>
  filter(cohort_name == "Rowan outreach") |>
  select(stud_id) |>
  unique()

# age in whole years on the involvement open date
ra_age <- involvements |>
  filter(cohort_name == "Rowan outreach") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% ra_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id") |>
  mutate(age = floor(as.numeric((open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(ra_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - Rowan outreach",
       x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

The majority of children with a Rowan outreach involvement have a primary need of Autism or Speech, Language & Communication Needs:

```{r}
#| label: table of Rowan outreach count by primary need

involvements |>
  filter(
    cohort_name == "Rowan outreach",
    #!is.na(sen_level)
  ) |>
  group_by(primary_specific_need) |>
  summarise(child_count = n_distinct(stud_id)) |>
  arrange(desc(child_count)) |>
  gt(rowname_col = "primary_specific_need") |>
  tab_header(
    title = "Rowan outreach",
    subtitle = "counts of children with involvements in 2024, by primary specific_need"
  ) |>
  tab_options(
    table.align = "left",
    heading.title.font.size = 12,
    heading.subtitle.font.size= 10,
    heading.align = "left",
    column_labels.font.size = 14,
    stub.font.size = 12,
    column_labels.hidden = TRUE
  ) |>
  cols_align("left",'primary_specific_need')
```

The 'before & after' plot for Rowan outreach shows a change in the direction of average attendance, but the picture is slightly messy because little complete 'before & after' data is available:

```{r}
#| label: plot Rowan outreach before after attendance
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "Rowan outreach") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue") +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue") +
  labs(title = "Attendance before and after intervention - Rowan outreach",
       subtitle = "Average percentage of available sessions attended +- 95 CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

Given that Rowan outreach work with children at or before school readiness age, it makes sense to look at the same primary school attendance profile as for Portage above. Year 0 is removed here because very little data is available. Although children involved with the outreach team generally show lower attendance than those without, children with Speech, Language & Communication Needs, children with Autism, and children with an EHC plan all show significant and consistent _improvement_ in attendance through years 1 to 3 when involved with the team.
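As a rough illustration of what sits behind each point and error bar in these with/without comparisons, the sketch below computes a group mean and a t-based 95% confidence interval in base R. It uses invented data, and the helper `mean_ci` is hypothetical: it mirrors the interval calculation written out in the Visual Impairment comparison in this report, and it is only an assumption that `summarise_avg()` works in a similar way.

```r
# Toy data: hypothetical pupils, NOT real attendance records
set.seed(1)
toy <- data.frame(
  ncy = rep(1:3, each = 40),                    # national curriculum year
  involved = rep(c(TRUE, FALSE), times = 60),   # hypothetical involvement flag
  percent_present = pmin(1, pmax(0, rnorm(120, mean = 0.9, sd = 0.05)))
)

# Mean and t-based confidence interval for one numeric vector
# (missing values are dropped before counting, which is a choice made here)
mean_ci <- function(x, conf = 0.95) {
  x <- x[!is.na(x)]
  n <- length(x)
  m <- mean(x)
  se <- sd(x) / sqrt(n)
  half <- qt(1 - (1 - conf) / 2, df = n - 1) * se
  c(mean = m, lower = m - half, upper = m + half, n = n)
}

# One row per curriculum year and involvement flag
groups <- split(toy$percent_present, list(toy$ncy, toy$involved), drop = TRUE)
ci_table <- as.data.frame(do.call(rbind, lapply(groups, mean_ci)))
ci_table
```

Where the intervals for the two groups overlap heavily in a given year, the difference in means for that year should be read with caution.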
```{r}
#| label: Rowan effectiveness by characteristic group
#| fig-height: 8

# flag pupils with any Rowan outreach involvement
Rowan_stud_id <- involvements |>
  filter(cohort_name == "Rowan outreach") |>
  select(stud_id) |>
  distinct() |>
  mutate(Rowan = TRUE)

attend_pri <- attend |>
  filter(ncy %in% c(1,2,3,4,5,6) & year >= 2021)

rowan_sen_support <- attend_pri |>
  left_join(Rowan_stud_id, by = "stud_id") |>
  filter(sen_level == "SEN Support") |>
  mutate(Rowan = if_else(is.na(Rowan),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","Rowan")) |>
  group_by(ncy,Rowan) |>
  summarise_avg() |>
  mutate(category = "SEN Support")

rowan_ehcp <- attend_pri |>
  left_join(Rowan_stud_id, by = "stud_id") |>
  filter(sen_level == "EHCP") |>
  mutate(Rowan = if_else(is.na(Rowan),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","Rowan")) |>
  group_by(ncy,Rowan) |>
  summarise_avg() |>
  mutate(category = "EHCP")

rowan_slcn <- attend_pri |>
  left_join(Rowan_stud_id, by = "stud_id") |>
  filter(primary_specific_need == "Speech, Language And Communication Needs") |>
  mutate(Rowan = if_else(is.na(Rowan),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","Rowan")) |>
  group_by(ncy,Rowan) |>
  summarise_avg() |>
  mutate(category = "Speech, Language And Communication Needs")

rowan_asd <- attend_pri |>
  left_join(Rowan_stud_id, by = "stud_id") |>
  filter(primary_specific_need == "Autism") |>
  mutate(Rowan = if_else(is.na(Rowan),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","Rowan")) |>
  group_by(ncy,Rowan) |>
  summarise_avg() |>
  mutate(category = "Autism")

rbind(
  #port_overall,
  rowan_sen_support,
  #port_deprived,
  rowan_ehcp,
  rowan_slcn,
  rowan_asd
) |>
  mutate(category = factor(category, levels = c("SEN Support",
                                                "EHCP",
                                                "Speech, Language And Communication Needs",
                                                "Autism"))) |>
  ggplot(aes(x = ncy, y = mean.percent_present, colour = Rowan, group = Rowan)) +
  geom_point() +
  geom_line(alpha = 0.7) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present),
                width = 0.2) +
  geom_text(aes(label = scales::percent(round(mean.percent_present,3))), vjust = 2, colour = "white", size = 4, position = position_dodge(0.9)) +
  geom_vline(aes(xintercept = 6.5), linetype = "dotted", colour = "gray") +
  facet_wrap(vars(category)) +
  labs(title = "Attendance through years 1 to 6 - pupils without and with Rowan outreach involvement",
       subtitle = "Average percentage of available sessions attended since 2021; +- 95 CI",
       caption = "data from Capita One") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  theme(axis.title.x = eb,
        #axis.text.y = eb,
        legend.position = "none",
        strip.background = eb,
        strip.text = element_text(size = 8),
        plot.title = element_markdown(size = 12)) +
  MetBrewer::scale_fill_met_d("Egypt")
```

## Visual Impairment team

```{r}
#| label: visual impairment team volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "VI team",
         exam_year_inv_start >= 2019,
         exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Visual Impairment Team - volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

vi_tm_inv_stud_id <- involvements |>
  filter(cohort_name == "VI team") |>
  select(stud_id) |>
  unique()

# age in whole years on the involvement open date
vi_tm_age <- involvements |>
  filter(cohort_name == "VI team") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% vi_tm_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id") |>
  mutate(age = floor(as.numeric(
    (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(vi_tm_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - VI team",
       x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

Involvement with the Visual Impairment team is associated with a small improvement in attendance:

```{r}
#| label: plot VI team average attendance before and after
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "VI team") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue") +
  labs(title = "Average attendance before and after Visual Impairment team",
       subtitle = "Mean percentage of available sessions attended +- 95 CI",
       caption = "data from Capita One",
       x = "half term (relative to start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

The stacked categorised plot shows an increase in the share of pupils in the highest attendance category over the first year of involvement with the team:

```{r}
#| label: categorised stack plot VI team

stack_data <- involvements |>
  filter(cohort_name == "VI team") |>
  select(stud_id,starts_with("attend_cat")) |>
  pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index,"attend_cat_","")) |>
  mutate(index = str_replace(index,"start_","")) |>
  mutate(index = str_replace(index,"_"," ")) |>
  mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7","prior 6","prior 5","prior 4","prior 3","prior 2","prior 1",
                                          "post 1","post 2","post 3","post 4","post 5","post 6","post 7","post 8","post 9","post 10","post 11","post 12"))) |>
  group_by(index, attend_cat) |>
  filter(attend_cat != "out of scope") |>
  filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 100%","preschool","leaving age","covid","out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb,
        legend.position = "right",
        axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Categorised attendance before & after - Visual Impairment team",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       caption = "data from Capita One",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label = str_wrap("inv start", width = 10), y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

As with the Hearing Impairment team, if we compare attendance levels for children with a Visual Impairment with and without the team's involvement, we see consistently lower attendance through all school years for children involved with the team. This presumably reflects a higher level of need, or other factors.
Children involved with the team do show improvements in attendance levels through primary school:

```{r}
# filter attendance data to pupils with a primary specific need of Visual Impairment, 2021 onwards
comp_VI <- attend |>
  filter(year >= 2021) |>
  filter(primary_specific_need == "Visual Impairment")

# one row per pupil with a VI team involvement
inv_VI <- involvements |>
  filter(cohort_name == "VI team") |>
  select(stud_id) |>
  mutate(inv_flag = 1) |>
  distinct()

# join involvements to attendance data
comp_VI <- comp_VI |>
  left_join(inv_VI, by = "stud_id") |>
  mutate(inv_flag = case_when(is.na(inv_flag) ~ 0,
                              TRUE ~ inv_flag)) |>
  distinct()

# mean attendance and t-based 95% confidence interval per group and curriculum year
comp_VI_summary <- comp_VI |>
  group_by(inv_flag, ncy) |>
  summarise(mean.percent_present = mean(percent_present, na.rm = TRUE),
            sd.percent_present = sd(percent_present, na.rm = TRUE),
            n.percent_present = n()) |>
  mutate(se.percent_present = sd.percent_present / sqrt(n.percent_present),
         lower.ci.percent_present = mean.percent_present - qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present,
         upper.ci.percent_present = mean.percent_present + qt(1 - (0.05 / 2), n.percent_present - 1) * se.percent_present)

# plot
ggplot(comp_VI_summary |> filter(ncy > 0), # year zero is an outlier here
       aes(x = ncy, y = mean.percent_present, colour = factor(inv_flag))) +
  geom_point() +
  geom_line(linetype = "dotted") +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2) +
  geom_text(aes(label = scales::percent(round(mean.percent_present,3), accuracy = 0.1)), vjust = 3, colour = "white", size = 3, position = position_dodge(0.9)) +
  labs(title = "Attendance by curriculum year without and with Visual Impairment team involvement",
       subtitle = "Pupils with a primary specific need of visual impairment; avg % of available sessions attended +- 95 CI",
       x = "national curriculum year") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(labels = seq(0,11), breaks = seq(0,11)) +
  theme(axis.title.x = eb,
        plot.title = element_markdown(size = 12),
        legend.position = "none") +
  MetBrewer::scale_fill_met_d("Egypt")
```

## UCAN

(this description copied from Sheffield Directory)

Sheffield Early Years Language Centre (Ucan) is funded jointly by the NHS and Sheffield Local Authority. The centre is staffed by speech and language therapists from the NHS and a teacher and assistant from the 0-5 SEND Support Service. The centre provides intensive early intervention for pre-school children with identified developmental language disorder (DLD) and training for parents and Early Years practitioners in meeting children's needs.

```{r}
#| label: UCAN volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "UCAN",
         exam_year_inv_start >= 2019,
         exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "UCAN - activity",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

ucan_inv_stud_id <- involvements |>
  filter(cohort_name == "UCAN") |>
  select(stud_id) |>
  unique()

# age in whole years on the involvement open date
ucan_age <- involvements |>
  filter(cohort_name == "UCAN") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% ucan_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id") |>
  mutate(age = floor(as.numeric((open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(ucan_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - UCAN",
       x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

Since involvement with UCAN is exclusively prior to school starting age, we cannot do the 'before & after' analysis. The majority of children involved with UCAN have special educational needs:

```{r}
#| label: table of UCAN count by primary need
#| layout: [45, -10, 45]

# counts by SEN level for the UCAN cohort (the cohort filter and title previously read "Portage")
involvements |>
  filter(cohort_name == "UCAN",
         !is.na(sen_level)) |>
  group_by(sen_level) |>
  summarise(child_count = n_distinct(stud_id)) |>
  arrange(desc(child_count)) |>
  gt(rowname_col = "sen_level") |>
  tab_header(
    title = "UCAN",
    subtitle = "counts of children with involvements in 2024, by SEN level"
  ) |>
  tab_options(
    table.align = "left",
    heading.title.font.size = 12,
    heading.subtitle.font.size= 10,
    heading.align = "left",
    column_labels.font.size = 14,
    stub.font.size = 12,
    column_labels.hidden = TRUE
  ) |>
  cols_align("left",'sen_level')

involvements |>
  filter(cohort_name == "UCAN",
         !is.na(imd_quartile)) |>
  group_by(imd_quartile) |>
  summarise(child_count = n_distinct(stud_id)) |>
  gt(rowname_col = "sen_level") |>
  tab_header(
    title = "UCAN - counts by IMD quartile",
    subtitle = "4 = most deprived wards; 1 = least deprived"
  ) |>
  tab_options(
    table.align = "left",
    heading.title.font.size = 12,
    heading.subtitle.font.size= 10,
    heading.align = "left",
    column_labels.font.size = 14,
    stub.font.size = 12,
    column_labels.hidden = TRUE
  ) |>
  cols_align("left",'imd_quartile')
```

The majority have speech, language and communication needs:

```{r}
#| label: UCAN table of counts by primary specific need

involvements |>
  filter(
    cohort_name == "UCAN",
    #!is.na(sen_level)
  ) |>
  group_by(primary_specific_need) |>
  summarise(child_count = n_distinct(stud_id)) |>
  arrange(desc(child_count)) |>
  gt(rowname_col = "primary_specific_need") |>
  tab_header(
    title = "UCAN",
    subtitle = "counts of children with involvements in 2024, by primary specific_need"
  ) |>
  tab_options(
    table.align = "left",
    heading.title.font.size = 12,
    heading.subtitle.font.size= 10,
    heading.align = "left",
    column_labels.font.size = 14,
    stub.font.size = 12,
    column_labels.hidden = TRUE
  ) |>
  cols_align("left",'primary_specific_need')
```

Some of this may be due to the recency of the UCAN involvements and the general recovery in attendance for younger children, but the data indicates that children with UCAN involvement have better attendance in primary school than peers with similar needs:

```{r}
#| label: UCAN effectiveness by characteristic group
#| fig-height: 8

# flag pupils with any UCAN involvement
UCAN_stud_id <- involvements |>
  filter(cohort_name == "UCAN") |>
  select(stud_id) |>
  distinct() |>
  mutate(UCAN = TRUE)

attend_pri <- attend |>
  filter(ncy %in% c(0,1,2,3,4,5,6) & year >= 2021)

ucan_sen_support <- attend_pri |>
  left_join(UCAN_stud_id, by = "stud_id") |>
  filter(sen_level == "SEN Support") |>
  mutate(UCAN = if_else(is.na(UCAN),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","UCAN")) |>
  group_by(ncy,UCAN) |>
  summarise_avg() |>
  mutate(category = "SEN Support")

ucan_ehcp <- attend_pri |>
  left_join(UCAN_stud_id, by = "stud_id") |>
  filter(sen_level == "EHCP") |>
  mutate(UCAN = if_else(is.na(UCAN),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","UCAN")) |>
  group_by(ncy,UCAN) |>
  summarise_avg() |>
  mutate(category = "EHCP")

ucan_slcn <- attend_pri |>
  left_join(UCAN_stud_id, by = "stud_id") |>
  filter(primary_specific_need == "Speech, Language And Communication Needs") |>
  mutate(UCAN = if_else(is.na(UCAN),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","UCAN")) |>
  group_by(ncy,UCAN) |>
  summarise_avg() |>
  mutate(category = "Speech, Language And Communication Needs")

ucan_asd <- attend_pri |>
  left_join(UCAN_stud_id, by = "stud_id") |>
  filter(primary_specific_need == "Autism") |>
  mutate(UCAN =
           if_else(is.na(UCAN),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","UCAN")) |>
  group_by(ncy,UCAN) |>
  summarise_avg() |>
  mutate(category = "Autism")

ucan_ehcp_ss_deprived <- attend_pri |>
  left_join(UCAN_stud_id, by = "stud_id") |>
  filter(ward_imd_score >= 40,
         sen_level %in% c("EHCP","SEN Support")) |>
  mutate(UCAN = if_else(is.na(UCAN),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","UCAN")) |>
  group_by(ncy,UCAN) |>
  summarise_avg() |>
  mutate(category = "SEN support or EHCP (most deprived wards)")

rbind(
  #port_overall,
  ucan_sen_support,
  ucan_ehcp_ss_deprived,
  ucan_ehcp,
  ucan_slcn#,
  #ucan_asd
) |>
  mutate(category = factor(category, levels = c("SEN Support",
                                                "EHCP",
                                                "Speech, Language And Communication Needs",
                                                #"Autism"#,
                                                "SEN support or EHCP (most deprived wards)"
                                                ))) |>
  filter(n.percent_present > 1) |>
  ggplot(aes(x = ncy, y = mean.percent_present, colour = UCAN, group = UCAN)) +
  geom_point() +
  geom_line(alpha = 0.7) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2) +
  geom_text(aes(label = scales::percent(round(mean.percent_present,3))), vjust = 2, colour = "white", size = 4, position = position_dodge(0.9)) +
  geom_vline(aes(xintercept = 6.5), linetype = "dotted", colour = "gray") +
  facet_wrap(vars(category)) +
  labs(title = "Attendance through years 0 to 6 - pupils without and with UCAN involvement",
       subtitle = "Average percentage of available sessions attended since 2021; +- 95 CI",
       caption = "data from Capita One") +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(labels = seq(0,6), breaks = seq(0,6)) +
  barplottheme_minimal +
  theme(axis.title.x = eb,
        #axis.text.y = eb,
        legend.position = "none",
        strip.background = eb,
        strip.text = element_text(size = 8),
        plot.title = element_markdown(size = 12)) +
  MetBrewer::scale_fill_met_d("Egypt")
```

## I&A - School Readiness

:::{.callout-warning}
Looking at the age profile, there is clearly a data quality issue with
some of these involvements. ::: ```{r} #| label: I A School Readiness volumes & age on first start ##| out-width: 75% ##| fig-height: 3 #| fig-align: "center" #| fig-ncol: 2 # plot volumes involvements |> filter(cohort_name == "I&A - School Readiness", exam_year_inv_start >= 2010, exam_year_inv_start <= 2024) |> group_by(exam_year_inv_start) |> summarise(child_count = n_distinct(stud_id)) |> ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) + geom_col(width = 0.8, fill = "steel blue") + geom_text(colour = "steel blue", vjust = -0.8, size = 3) + barplottheme_minimal + labs(title = "I&A - School Readiness activity", subtitle = "involvements starting by accademic year; 2025 is part year only", caption = "data from Capita One") + theme(axis.title.x = eb, axis.text.y = eb) + scale_x_continuous(breaks = seq(2010,2025)) + coord_cartesian(clip = "off") ia_sr_inv_stud_id <- involvements |> filter(cohort_name == "I&A - School Readiness") |> select(stud_id) |> unique() ia_sr_inclusion_age <- involvements |> filter(cohort_name == "I&A - School Readiness") |> select(stud_id, cohort_name, open_date) |> left_join(attend |> filter(stud_id %in% ia_sr_inv_stud_id$stud_id) |> select(stud_id, dob) |> unique(), by = "stud_id" ) |> mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |> group_by(age) |> summarise(child_count = n_distinct(stud_id)) ggplot(ia_sr_inclusion_age, aes(x = age, y = child_count, label = child_count)) + geom_col(fill = "steelblue") + geom_text(colour = "steelblue", size = 3, vjust = -1)+ theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) + labs(title = "Age on involvement start - I&A - School Readiness", x = "age on involvement open date") #+ #scale_x_continuous(breaks = seq(0:16), limits = c(0,16)) ``` The team work more with children in more deprived areas, and those who require SEN support. 
```{r}
#| label: table of I A school readiness team count by characteristics
#| layout: [45, -10, 45]

involvements |>
  filter(
    ##exam_year_inv_start == 2024,
    cohort_name == "I&A - School Readiness",
    !is.na(sen_level)) |>
  group_by(
    #sen_need_category
    sen_level
  ) |>
  summarise(child_count = n_distinct(stud_id)) |>
  arrange(desc(child_count)) |>
  gt(rowname_col = "sen_level") |>
  tab_header(
    title = "I&A - School Readiness",
    subtitle = "counts of children with involvements in 2024, by SEN level"
  ) |>
  tab_options(
    table.align = "left",
    heading.title.font.size = 12,
    heading.subtitle.font.size= 10,
    heading.align = "left",
    column_labels.font.size = 14,
    stub.font.size = 12,
    column_labels.hidden = TRUE
  ) |>
  cols_align("left",'sen_level')

# the stub column must be the grouping column used in this table (imd_quartile)
involvements |>
  filter(cohort_name == "I&A - School Readiness",
         !is.na(imd_quartile)) |>
  group_by(imd_quartile) |>
  summarise(child_count = n_distinct(stud_id)) |>
  arrange(desc(child_count)) |>
  gt(rowname_col = "imd_quartile") |>
  tab_header(
    title = "I&A - School Readiness - counts by IMD quartile",
    subtitle = "4 = most deprived wards; 1 = least deprived"
  ) |>
  tab_options(
    table.align = "left",
    heading.title.font.size = 12,
    heading.subtitle.font.size= 10,
    heading.align = "left",
    column_labels.font.size = 14,
    stub.font.size = 12,
    column_labels.hidden = TRUE
  ) |>
  cols_align("left",'imd_quartile')
```

```{r}
#| label: table of I A school readiness count by primary need

involvements |>
  filter(
    cohort_name == "I&A - School Readiness"
  ) |>
  group_by(
    primary_specific_need
  ) |>
  summarise(child_count = n_distinct(stud_id)) |>
  arrange(desc(child_count)) |>
  gt(rowname_col = "primary_specific_need") |>
  tab_header(
    title = "I&A - School Readiness",
    subtitle = "counts of children with involvements in 2024, by primary specific_need"
  ) |>
  tab_options(
    table.align = "left",
    heading.title.font.size = 12,
    heading.subtitle.font.size= 10,
    heading.align = "left",
    column_labels.font.size = 14,
    stub.font.size = 12,
    column_labels.hidden = TRUE
  ) |>
cols_align("left",'primary_specific_need') ``` If we look at all pupils, we see that children who the school readiness team work with tend to have lower attendance throughout primary school, including a bigger dropoff into Y6. Children involved with the team show an overall sustained improvement in their attendance through primary school: ```{r} #| label: plot primary attendance with and without I A school Readiness involvement #| fig-height: 3.5 ia_sr_inv_stud_id <- involvements |> filter(cohort_name == "I&A - School Readiness") |> select(stud_id) |> distinct() |> mutate(ia_sr_team = TRUE) attend_pri <- attend |> filter( ncy %in% c(1,2,3,4,5,6) & year >= 2021 ) ia_sr_primary <- attend_pri |> left_join(ia_sr_inv_stud_id, by = "stud_id") |> mutate(ia_sr_team = if_else(is.na(ia_sr_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","ia_sr_team")) |> group_by(ncy,ia_sr_team) |> summarise_avg() |> mutate(category = "overall") ggplot(ia_sr_primary, aes(x = ncy, y = mean.percent_present, colour = ia_sr_team, group = ia_sr_team) ) + geom_point() + geom_line(alpha = 0.7 ) + geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2)+ geom_vline(aes(xintercept = 6.5), linetype = "dotted", colour = "gray") + labs(title = "Attendance through primary - pupils without and with I&A school readiness involvement", subtitle = "Average percentage of available sessions attended since 2021; +- 95 CI", caption = "data from Capita One") + barplottheme_minimal + scale_y_continuous(labels = scales::percent) + scale_x_continuous(labels = seq(1,6), breaks = seq(1,6)) + theme(axis.title.x = eb, legend.position = "none", strip.background = eb, strip.text = element_text(size = 8), plot.title = element_markdown(size = 12)) + MetBrewer::scale_fill_met_d("Egypt") ``` Breaking this down for the same characteristic groups we used for various other involvements above, we see generally lower attendance for the children accessing the 
service, which probably reflects factors not captured in the data - i.e. significant need beyond what we're controlling for here. The exception seems to be children with Autism, who appear to significantly benefit in attendance. ```{r} #| label: I A School Readiness effectiveness by characteristic group #| fig-height: 8 attend_sr <- attend |> filter( ncy %in% c(0,1,2,3,4,5,6) & year >= 2021 ) sr_overall <- attend_sr |> left_join(ia_sr_inv_stud_id, by = "stud_id") |> mutate(ia_sr_team = if_else(is.na(ia_sr_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","ia_sr_team")) |> group_by(ncy,ia_sr_team) |> summarise_avg() |> mutate(category = "all pupils") sr_deprived <- attend_sr |> left_join(ia_sr_inv_stud_id, by = "stud_id") |> filter(ward_imd_score >= 40 ) |> mutate(ia_sr_team = if_else(is.na(ia_sr_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","ia_sr_team")) |> group_by(ncy,ia_sr_team) |> summarise_avg() |> mutate(category = "most deprived wards") sr_sen_support <- attend_sr |> left_join(ia_sr_inv_stud_id, by = "stud_id") |> filter(sen_level == "SEN Support" ) |> mutate(ia_sr_team = if_else(is.na(ia_sr_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","ia_sr_team")) |> group_by(ncy,ia_sr_team) |> summarise_avg() |> mutate(category = "SEN Support") sr_ehcp <- attend_sr |> left_join(ia_sr_inv_stud_id, by = "stud_id") |> filter(sen_level == "EHCP" ) |> mutate(ia_sr_team = if_else(is.na(ia_sr_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","ia_sr_team")) |> group_by(ncy,ia_sr_team) |> summarise_avg() |> mutate(category = "EHCP") sr_slcn <- attend_pri |> left_join(ia_sr_inv_stud_id, by = "stud_id") |> filter(primary_specific_need == "Speech, Language And Communication Needs" ) |> mutate(ia_sr_team = if_else(is.na(ia_sr_team),FALSE,TRUE)) |> summarise_attendance(grouping_vars = c("stud_id","ncy","ia_sr_team")) |> group_by(ncy,ia_sr_team) |> summarise_avg() |> 
  mutate(category = "Speech, Language And Communication Needs")

sr_asd <- attend_pri |>
  left_join(ia_sr_inv_stud_id, by = "stud_id") |>
  filter(primary_specific_need == "Autism" ) |>
  mutate(ia_sr_team = if_else(is.na(ia_sr_team),FALSE,TRUE)) |>
  summarise_attendance(grouping_vars = c("stud_id","ncy","ia_sr_team")) |>
  group_by(ncy,ia_sr_team) |>
  summarise_avg() |>
  mutate(category = "Autism")

# gather and plot
rbind(
  sr_overall,
  sr_sen_support,
  sr_deprived,
  sr_ehcp,
  sr_slcn,
  sr_asd
) |>
  mutate(category = factor(category,
                           levels = c("all pupils",
                                      "most deprived wards",
                                      "SEN Support",
                                      "EHCP",
                                      "Speech, Language And Communication Needs",
                                      "Autism")
  )) |>
  ggplot(aes(x = ncy, y = mean.percent_present, colour = ia_sr_team, group = ia_sr_team)) +
  geom_point() +
  geom_line(alpha = 0.7 ) + #linetype = "dotted") +
  #geom_text(aes(label = n.percent_present)) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2)+
  geom_text(aes(label = scales::percent(round(mean.percent_present,3))), vjust = 2, colour = "white", size = 4, position = position_dodge(0.9)) +
  geom_vline(aes(xintercept = 6.5), linetype = "dotted", colour = "gray") +
  facet_wrap(vars(category)) +
  # title matches the x axis, which covers national curriculum years 0 to 6
  labs(title = "Attendance through years 0 to 6 - pupils without and with I&A School Readiness involvement",
       subtitle = "Average percentage of available sessions attended since 2021; +- 95 CI",
       caption = "data from Capita One") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(labels = seq(0,6), breaks = seq(0,6)) +
  theme(axis.title.x = eb,
        legend.position = "none",
        strip.background = eb,
        strip.text = element_text(size = 8),
        plot.title = element_markdown(size = 12)) +
  MetBrewer::scale_fill_met_d("Egypt")
```

## S437- School Att Order EHE

These are attendance orders relating to electively home educated (EHE) children. Sheffield City Council has just 35 of these recorded in 2024. The orders can be issued to the parents of children of any age.
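
The age charts in this report derive age from the involvement `open_date` and the child's `dob`. A minimal sketch of that arithmetic on made-up rows follows. The toy data frame `inv`, the `age_flag` column and the 0 to 18 plausibility range are assumptions for illustration only, not part of the data model.

```r
# Toy data: hypothetical pupils, not taken from the Oscar database
inv <- data.frame(
  stud_id   = c(1, 2, 3),
  dob       = as.Date(c("2012-09-01", "2008-03-15", "2019-06-30")),
  open_date = as.Date(c("2024-01-10", "2024-06-15", "2018-01-01"))
)

# Age in completed years on the involvement open date
# (days between the two dates divided by 365.25, rounded down)
inv$age <- floor(as.numeric(inv$open_date - inv$dob) / 365.25)

# Flag ages outside an assumed plausible range of 0 to 18 for review,
# e.g. an open date recorded before the date of birth
inv$age_flag <- inv$age < 0 | inv$age > 18

inv
```

Rows where `age_flag` is `TRUE` are candidates for a data quality check rather than evidence about the service itself.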
```{r}
#| label: S437 volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "S437- School Att Order EHE",
         exam_year_inv_start >= 2010,
         exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "S437- School Att Order EHE activity",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb,
        axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

S437_inv_stud_id <- involvements |>
  filter(cohort_name == "S437- School Att Order EHE") |>
  select(stud_id) |>
  unique()

# age on involvement open date, in completed years
S437_age <- involvements |>
  filter(cohort_name == "S437- School Att Order EHE") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% S437_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id"
  ) |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(S437_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1)+
  theme(axis.text.y = eb,
        axis.title.y = eb,
        axis.line.y = eb,
        axis.ticks.y = eb) +
  labs(title = "Age on involvement start - S437- School Attendance Order EHE",
       x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

S437 Attendance Orders are followed by a sudden and significant improvement in attendance levels.
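
The before and after averages are produced by the separate data model script, so the exact method is not shown in this report. As a rough sketch only, a mean with a normal-approximation 95% confidence interval per half term could be computed as below. The toy data, the `mean_ci()` helper and the 1.96 multiplier are assumptions, not the report's `summarise_avg()` implementation.

```r
# Toy data: hypothetical attendance rates per pupil and half term
toy <- data.frame(
  stud_id         = rep(1:4, times = 2),
  term_index      = rep(c(-1, 1), each = 4),
  percent_present = c(0.20, 0.40, NA, 0.60,    # half term before the order
                      0.80, 0.90, 0.70, 1.00)  # half term after the order
)

# Mean and normal-approximation 95% CI, ignoring missing values
mean_ci <- function(x) {
  x  <- x[!is.na(x)]
  n  <- length(x)
  m  <- mean(x)
  se <- sd(x) / sqrt(n)
  c(n = n, mean = m, lower = m - 1.96 * se, upper = m + 1.96 * se)
}

# One row per half term index, one column per statistic
summary_by_term <- t(sapply(split(toy$percent_present, toy$term_index), mean_ci))
summary_by_term
```

In this sketch a pupil with no attendance record for a half term drops out of that half term's average, so an average of this kind can rise while many children remain absent from the records entirely.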
```{r} #| label: plot S437 School Att Order EHE average attendance before and after inv_summary_long_start |> filter(cohort_name == "S437- School Att Order EHE") |> ggplot(aes(x = term_index, y = mean.percent_present ) ) + geom_point(colour = "steel blue", size = 2) + geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue")+ labs(title = "Average attendance before and after S437- School Attendance Order EHE", subtitle = "Mean percentage of available sessions attended +- 95 CI", caption = "data from Capita One", x = "half term (relative to start date)")+ barplottheme_minimal + scale_y_continuous(labels = scales::percent, limits = c(0,1)) + scale_x_continuous(breaks = seq(-12,12)) + theme(axis.text.x = element_text(size = 8), legend.position = "none") ``` Behind those averages though, there are still large volumes of children with entirely missing attendance data - are these those still being home educated? ```{r} # stacked bar of % category stack_data <- involvements |> filter(cohort_name == "S437- School Att Order EHE" ) |> select(stud_id,starts_with("attend_cat")) |> pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |> mutate(index = str_replace(index,"attend_cat_","")) |> mutate(index = str_replace(index,"start_","")) |> mutate(index = str_replace(index,"_"," ")) |> mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7","prior 6","prior 5","prior 4","prior 3","prior 2","prior 1","post 1","post 2","post 3","post 4","post 5","post 6","post 7","post 8","post 9","post 10","post 11","post 12"))) |> group_by(index, attend_cat) |> filter(attend_cat != "out of scope") |> filter(attend_cat != "preschool") |> #filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |> tally() |> mutate(freq = n / sum(n)) |> filter(!is.na(index)) |> mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 
100%","preschool","leaving age","covid","out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE))+
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb,
        legend.position = "right",
        axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  # title names the EHE cohort that stack_data was filtered to
  labs(title = "Categorised attendance before & after - S437- School Att Order EHE",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label = str_wrap("inv start", width = 10), y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

## S437 - School Att Order CME

These are school attendance orders relating to a Child Missing Education (CME). As with the EHE attendance orders, Sheffield issues just a few dozen of these per year.
CME orders are mostly issued to parents of secondary school pupils:

```{r}
#| label: S437 School Att Order CME volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "S437 - School Att Order CME",
         exam_year_inv_start >= 2010,
         exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "S437 - School Att Order CME activity",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb,
        axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

S437_inv_stud_id <- involvements |>
  filter(cohort_name == "S437 - School Att Order CME") |>
  select(stud_id) |>
  unique()

# age on involvement open date, in completed years
S437_age <- involvements |>
  filter(cohort_name == "S437 - School Att Order CME") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% S437_inv_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id"
  ) |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(S437_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1)+
  theme(axis.text.y = eb,
        axis.title.y = eb,
        axis.line.y = eb,
        axis.ticks.y = eb) +
  labs(title = "Age on involvement start - S437 - School Att Order CME",
       x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

In the 'before & after' analysis, the prior half terms are blank because these children were missing from school entirely for this period - hence the attendance order.
Average attendance after the attendance orders improves dramatically (but remains below overall average levels). (note that some half term -1 data here was removed, presumed incorrectly indexed on the wrong date) ```{r} #| label: plot S437 School Att Order CME average attendance before and after inv_summary_long_start |> filter(cohort_name == "S437 - School Att Order CME") |> filter(term_index != -1) |> ggplot(aes(x = term_index, y = mean.percent_present ) ) + geom_point(colour = "steel blue", size = 2) + geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue")+ labs(title = "Average attendance before and after S437 - School Attendance Order CME", subtitle = "Mean percentage of available sessions attended +- 95 CI", caption = "data from Capita One", x = "half term (relative to start date)")+ barplottheme_minimal + scale_y_continuous(labels = scales::percent) + scale_x_continuous(breaks = seq(-12,12)) + theme(axis.text.x = element_text(size = 8), legend.position = "none") ``` ```{r} #| eval: false ridge_present <- involvements |> filter(cohort_name == "S437 - School Att Order CME") |> select(-contains("year")) |> select(stud_id,starts_with("percent_present_prior"), starts_with("percent_present_start_post")) |> pivot_longer(-stud_id,names_to = "index", values_to = "present") |> mutate(present = replace_na(present,0)) |> mutate(present = present * 100) |> mutate(index = str_replace(index,"percent_present_","")) |> mutate(index = str_replace(index,"start_","")) |> mutate(index = factor(index, levels = c("prior_12","prior_11","prior_10","prior_9","prior_8","prior_7","prior_6","prior_5","prior_4","prior_3","prior_2","prior_1","post_1","post_2","post_3","post_4","post_5","post_6","post_7","post_8","post_9","post_10","post_11","post_12"))) ggplot(ridge_present, aes(x = present, y = fct_rev(index), fill = index)) + geom_density_ridges(alpha = 0.5)+ scale_x_continuous(limits = c(0,100)) + theme(axis.title.y = 
eb, legend.position = "none") + labs(title = "Distribution of attendance in half term periods relative to Attendance Advice start date") ``` ```{r} #library(rlang) sankey_fun <- function(input_data = involvements, cohort_name_selection, prior_interval, post_interval, attend_only = FALSE ) { attend_only_list = c("0 - 50%","50 - 80%","80 - 90%","90 - 100%") prior_col <- str_c("attend_cat_prior_",as.character(prior_interval)) post_col <- str_c("attend_cat_start_post_",as.character(post_interval)) output_all <- input_data |> filter(cohort_name == cohort_name_selection) |> mutate(attend_cat_prior = paste0(.data[[prior_col]]," prior"), attend_cat_post = paste0(.data[[post_col]]," post") ) |> group_by(attend_cat_prior, attend_cat_post) |> tally() |> mutate(sankey_code = str_c(attend_cat_prior," [",n,"] ",attend_cat_post)) output_attend_only <- input_data |> filter(cohort_name == cohort_name_selection, .data[[prior_col]] %in% attend_only_list, .data[[post_col]] %in% attend_only_list ) |> mutate(attend_cat_prior = paste0(.data[[prior_col]]," prior"), attend_cat_post = paste0(.data[[post_col]]," post") ) |> group_by(attend_cat_prior, attend_cat_post) |> tally() |> mutate(sankey_code = str_c(attend_cat_prior," [",n,"] ",attend_cat_post)) ifelse(attend_only == TRUE, {output <- output_attend_only}, {output <- output_all} ) write.table(output$sankey_code, file = "clipboard-20000", row.names = F, col.names = F, quote=F, sep = "\t") } sankey_fun( input_data = involvements, cohort_name_selection = "S437 - School Att Order CME", prior_interval = 6, post_interval = 6, attend_only = FALSE) ``` All coded absence reasons show a large decrease following the CME attendance orders: ```{r} #| label: plot before & after by reason code S437 school att order CME #| fig-height: 4 before_after_year_fun(cohort_name_selection = "S437 - School Att Order CME") |> ggplot(aes(x = name, y = mean, fill = prior_post, label = scales::percent(mean, accuracy = 0.01))) + geom_col(position = "dodge") + 
geom_errorbar(aes(ymax = upper_ci,ymin = lower_ci), width = 0.2, alpha = 0.5, position = position_dodge(width = 0.9), colour = "gray20") + geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) + scale_fill_met_d("Egypt") + scale_colour_met_d("Egypt") + scale_y_continuous(labels = scales::percent) + barplottheme_minimal + labs(title = "Absence reasons before and after S437 - School Att Order CME", subtitle = "Average % of sessions missed one year either side of involvement or episode start date", caption = "data from Capita One") + theme(plot.title = element_markdown(size = 12), legend.position = "none", eb, axis.text.y = eb) + scale_x_discrete(labels = function(x) str_wrap(x, width = 2)) ``` The categorised attendance data shows a dramatic change following this involvement, with a big reduction in the NA category and increases across all 'present' categories - though there are also signs that effects have a limited timespan, as the % in the highest attendance category drops away, and the NA category increases once again. 
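
The attendance categories themselves are assigned in the data model script. As an illustration of the idea only, attendance rates could be binned into the bands used in these charts, with missing records kept as their own "NA" category, along the lines below. The toy vector and the boundary handling (which band exactly 50%, 80% or 90% falls into) are assumptions.

```r
# Toy data: hypothetical attendance rates; NA means no attendance record
percent_present <- c(0.35, 0.72, 0.85, 0.95, 1.00, NA, NA)

# Bin into the bands shown in the stacked charts.
# Assumed intervals: [0, 0.5), [0.5, 0.8), [0.8, 0.9), [0.9, 1]
bands <- cut(
  percent_present,
  breaks = c(0, 0.5, 0.8, 0.9, 1),
  labels = c("0 - 50%", "50 - 80%", "80 - 90%", "90 - 100%"),
  include.lowest = TRUE,
  right = FALSE
)

# Keep missing records as an explicit "NA" category rather than dropping them
attend_cat <- as.character(bands)
attend_cat[is.na(attend_cat)] <- "NA"
attend_cat <- factor(
  attend_cat,
  levels = c("NA", "0 - 50%", "50 - 80%", "80 - 90%", "90 - 100%")
)

# Share of pupils in each category
freq <- prop.table(table(attend_cat))
round(freq, 2)
```

Keeping "NA" as a level is what lets the stacked charts show children with no attendance record at all, which a simple average would hide.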
```{r} # stacked bar of % category stack_data <- involvements |> filter(cohort_name == "S437 - School Att Order CME") |> select(stud_id,starts_with("attend_cat")) |> pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |> mutate(index = str_replace(index,"attend_cat_","")) |> mutate(index = str_replace(index,"start_","")) |> mutate(index = str_replace(index,"_"," ")) |> mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7","prior 6","prior 5","prior 4","prior 3","prior 2","prior 1","post 1","post 2","post 3","post 4","post 5","post 6","post 7","post 8","post 9","post 10","post 11","post 12"))) |> group_by(index, attend_cat) |> filter(attend_cat != "out of scope") |> filter(attend_cat != "preschool") |> filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |> tally() |> mutate(freq = n / sum(n)) |> filter(!is.na(index)) |> mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 100%","preschool","leaving age","covid","out of scope"))) ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) + geom_col(position = position_stack(reverse = TRUE))+ #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) + theme(axis.title.y = eb, legend.position = "right", axis.text.x = element_text(angle = 90, vjust = 0.5)) + scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) + scale_y_continuous(labels = scales::percent) + labs(title = "Categorised attendance before & after - S437 - School Att Order CME", subtitle = "Percentage of pupils in each category", x = "half term index (relative to involvement start date)", fill = "") + geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") + annotate(geom = "text", label = str_wrap("inv start", width = 10), y = 1.05, x = 13.5, colour = "gray40", size = 2.5) + coord_cartesian(clip = "off") ``` ## I&A - 
Reintegration ```{r} #| label: I A Reintegration volumes & age on first start ##| out-width: 75% ##| fig-height: 3 #| fig-align: "center" #| fig-ncol: 2 # plot volumes involvements |> filter(cohort_name == "I&A - Reintegration", exam_year_inv_start >= 2010, exam_year_inv_start <= 2024) |> group_by(exam_year_inv_start) |> summarise(child_count = n_distinct(stud_id)) |> ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) + geom_col(width = 0.8, fill = "steel blue") + geom_text(colour = "steel blue", vjust = -0.8, size = 3) + barplottheme_minimal + labs(title = "I&A - Reintegration activity", subtitle = "involvements starting by accademic year; 2025 is part year only", caption = "data from Capita One") + theme(axis.title.x = eb, axis.text.y = eb) + scale_x_continuous(breaks = seq(2010,2025)) + coord_cartesian(clip = "off") ia_reint_inv_stud_id <- involvements |> filter(cohort_name == "I&A - Reintegration") |> select(stud_id) |> unique() ia_reint_age <- involvements |> filter(cohort_name == "I&A - Reintegration") |> select(stud_id, cohort_name, open_date) |> left_join(attend |> filter(stud_id %in% ia_reint_inv_stud_id$stud_id) |> select(stud_id, dob) |> unique(), by = "stud_id" ) |> mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |> group_by(age) |> summarise(child_count = n_distinct(stud_id)) ggplot(ia_reint_age, aes(x = age, y = child_count, label = child_count)) + geom_col(fill = "steelblue") + geom_text(colour = "steelblue", size = 3, vjust = -1)+ theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) + labs(title = "Age on involvement start - I&A - Reintegration", x = "age on involvement open date") #+ #scale_x_continuous(breaks = seq(0:16), limits = c(0,16)) ``` ```{r} #| label: plot reintegration avg before and after inv_summary_long_start |> filter(cohort_name == "I&A - Reintegration") |> ggplot(aes(x = term_index, y = mean.percent_present ) ) + geom_point(colour = "steel blue", size = 2) + 
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue")+
  labs(title = "Average attendance before and after I&A - Reintegration",
       subtitle = "Mean percentage of available sessions attended +- 95 CI",
       caption = "data from Capita One",
       x = "half term (relative to start date)")+
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8),
        legend.position = "none")
```

The plot above shows a significant shift immediately after the involvement. The categorised reasons show a similar shift whether we compare half term to half term or year on year - a reduction in overall absences, mostly driven by a reduction in exclusions, while illness & lateness levels rise slightly:

```{r}
#| label: plot before & after by reason code I A reintegration
#| fig-height: 4

before_after_ht_fun(cohort_name_selection = "I&A - Reintegration") |>
  ggplot(aes(x = name, y = mean, fill = prior_post, label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci,ymin = lower_ci), width = 0.2, alpha = 0.5, position = position_dodge(width = 0.9), colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons before and after I&A - Reintegration",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12),
        legend.position = "none",
        eb,
        axis.text.y = eb) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 2))
```

Finally, the stacked categorised bar chart shows more detail than the average before & after plot - with the most significant movement being between those
severely absent and those in the highest attendance bracket (although these fall away over time): ```{r} #| label: categorised stacked bar I A reintegration # stacked bar of % category stack_data <- involvements |> filter(cohort_name == "I&A - Reintegration") |> select(stud_id,starts_with("attend_cat")) |> pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |> mutate(index = str_replace(index,"attend_cat_","")) |> mutate(index = str_replace(index,"start_","")) |> mutate(index = str_replace(index,"_"," ")) |> mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7","prior 6","prior 5","prior 4","prior 3","prior 2","prior 1","post 1","post 2","post 3","post 4","post 5","post 6","post 7","post 8","post 9","post 10","post 11","post 12"))) |> group_by(index, attend_cat) |> filter(attend_cat != "out of scope") |> #filter(attend_cat != "preschool") |> #filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |> tally() |> mutate(freq = n / sum(n)) |> filter(!is.na(index)) |> mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 100%","preschool","leaving age","covid","out of scope"))) ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) + geom_col(position = position_stack(reverse = TRUE))+ #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) + theme(axis.title.y = eb, legend.position = "right", axis.text.x = element_text(angle = 90, vjust = 0.5)) + scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) + scale_y_continuous(labels = scales::percent) + labs(title = "Categorised attendance before & after I&A - Reintegration", subtitle = "Percentage of pupils in each category", x = "half term index (relative to involvement start date)", fill = "") + geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") + annotate(geom = "text", label 
= str_wrap("inv start", width = 10), y = 1.05, x = 13.5, colour = "gray40", size = 2.5) + coord_cartesian(clip = "off") ``` ## School Attendance Order Breach ```{r} #| label: school attendance order breach volumes & age on first start ##| out-width: 75% ##| fig-height: 3 #| fig-align: "center" #| fig-ncol: 2 # plot volumes involvements |> filter(cohort_name == "School Att Order Breach", exam_year_inv_start >= 2010, exam_year_inv_start <= 2024) |> group_by(exam_year_inv_start) |> summarise(child_count = n_distinct(stud_id)) |> ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) + geom_col(width = 0.8, fill = "steel blue") + geom_text(colour = "steel blue", vjust = -0.8, size = 3) + barplottheme_minimal + labs(title = "School Attendance Order Breach volumes", subtitle = "involvements starting by accademic year; 2025 is part year only", caption = "data from Capita One") + theme(axis.title.x = eb, axis.text.y = eb) + scale_x_continuous(breaks = seq(2010,2025)) + coord_cartesian(clip = "off") saob_inv_stud_id <- involvements |> filter(cohort_name == "School Att Order Breach") |> select(stud_id) |> unique() saob_age <- involvements |> filter(cohort_name == "School Att Order Breach") |> select(stud_id, cohort_name, open_date) |> left_join(attend |> filter(stud_id %in% saob_inv_stud_id$stud_id) |> select(stud_id, dob) |> unique(), by = "stud_id" ) |> mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |> group_by(age) |> summarise(child_count = n_distinct(stud_id)) ggplot(saob_age, aes(x = age, y = child_count, label = child_count)) + geom_col(fill = "steelblue") + geom_text(colour = "steelblue", size = 3, vjust = -1)+ theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) + labs(title = "Age on involvement start - School Attendance Order Breach", x = "age on involvement open date") #+ #scale_x_continuous(breaks = seq(0:16), limits = c(0,16)) ``` ```{r} #| label: plot school attendance order breach avg before and after 
inv_summary_long_start |>
  filter(cohort_name == "School Att Order Breach") |>
  ggplot(aes(x = term_index, y = mean.percent_present ) ) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present), width = 0.2, colour = "steel blue")+
  labs(title = "Average attendance before and after School Attendance Order Breach",
       subtitle = "Mean percentage of available sessions attended +- 95 CI",
       caption = "data from Capita One",
       x = "half term (relative to start date)")+
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8),
        legend.position = "none")
```

The plot above shows a significant shift immediately after the involvement. The categorised reasons show a similar shift whether we compare half term to half term or year on year - a reduction in overall absences, mostly driven by a reduction in exclusions, while illness & lateness levels rise slightly:

```{r}
#| label: plot before & after by reason code school attendance order breach
#| fig-height: 4

before_after_year_fun(cohort_name_selection = "School Att Order Breach") |>
  ggplot(aes(x = name, y = mean, fill = prior_post, label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci,ymin = lower_ci), width = 0.2, alpha = 0.5, position = position_dodge(width = 0.9), colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons before and after School Attendance Order Breach",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12),
        legend.position = "none",
        eb,
        axis.text.y = eb) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 2))
```

Finally, the stacked categorised bar chart shows how, despite the increase in _average_ attendance, the majority of children are absent from the attendance records entirely both before and after the attendance order breach involvement:

```{r}
#| label: categorised stacked bar school attendance order breach

# stacked bar of % category
stack_data <- involvements |>
  filter(cohort_name == "School Att Order Breach") |>
  select(stud_id,starts_with("attend_cat")) |>
  pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index,"attend_cat_","")) |>
  mutate(index = str_replace(index,"start_","")) |>
  mutate(index = str_replace(index,"_"," ")) |>
  mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7",
                                          "prior 6","prior 5","prior 4","prior 3","prior 2","prior 1",
                                          "post 1","post 2","post 3","post 4","post 5","post 6",
                                          "post 7","post 8","post 9","post 10","post 11","post 12"))) |>
  group_by(index, attend_cat) |>
  filter(attend_cat != "out of scope") |>
  #filter(attend_cat != "preschool") |>
  #filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 100%",
                                                    "preschool","leaving age","covid","out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb, legend.position = "right",
        axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Categorised attendance before & after School Attendance Order Breach",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label = str_wrap("inv start", width = 10),
           y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

## GP protocol

```{r}
#| label: GP Protocol volumes & age on first start
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

# plot volumes
involvements |>
  filter(cohort_name == "GP Protocol",
         exam_year_inv_start >= 2010,
         exam_year_inv_start <= 2024) |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "GP Protocol volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")

gpp_stud_id <- involvements |>
  filter(cohort_name == "GP Protocol") |>
  select(stud_id) |>
  unique()

gpp_age <- involvements |>
  filter(cohort_name == "GP Protocol") |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% gpp_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id"
            ) |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(gpp_age, aes(x = age, y = child_count, label = child_count)) +
  geom_col(fill = "steelblue") +
  geom_text(colour = "steelblue", size = 3, vjust = -1) +
  theme(axis.text.y = eb, axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb) +
  labs(title = "Age on involvement start - GP Protocol",
       x = "age on involvement open date") #+
  #scale_x_continuous(breaks = seq(0:16), limits = c(0,16))
```

There is no apparent effect on attendance levels from this involvement:

```{r}
#| label: plot GP Protocol avg before and after

inv_summary_long_start |>
  filter(cohort_name == "GP Protocol") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present),
                width = 0.2, colour = "steel blue") +
  labs(title = "Average attendance before and after GP Protocol involvement",
       subtitle = "Mean percentage of available sessions attended +- 95 CI",
       caption = "data from Capita One",
       x = "half term (relative to start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

The categorised reason plot is not included here, as it shows no particular movement. However, although there is no change in overall _average_ attendance, there is significant movement on the stacked bar plot of attendance brackets.
This happens immediately following the involvement, with growth in both the highest and lowest attendance brackets:

```{r}
#| label: categorised stacked bar GP Protocol

# stacked bar of % category
stack_data <- involvements |>
  filter(cohort_name == "GP Protocol") |>
  select(stud_id,starts_with("attend_cat")) |>
  pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index,"attend_cat_","")) |>
  mutate(index = str_replace(index,"start_","")) |>
  mutate(index = str_replace(index,"_"," ")) |>
  mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7",
                                          "prior 6","prior 5","prior 4","prior 3","prior 2","prior 1",
                                          "post 1","post 2","post 3","post 4","post 5","post 6",
                                          "post 7","post 8","post 9","post 10","post 11","post 12"))) |>
  group_by(index, attend_cat) |>
  filter(attend_cat != "out of scope") |>
  #filter(attend_cat != "preschool") |>
  #filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 100%",
                                                    "preschool","leaving age","covid","out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb, legend.position = "right",
        axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Categorised attendance before & after GP Protocol",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label = str_wrap("inv start", width = 10),
           y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

## Nurture

There are five different involvement types here, all under the banner of 'Nurture'. These appear to cover different sites in the city, but the assumption is that they are all broadly the same service, so they are covered here together. Those five involvement types are:

- Nurture - BLT Hinde House
- Nurture - BLT Yewlands
- Nurture - Bumblebee
- Nurture - BLT Earl Marshall
- Nurture - Step Out

Volumes are quite low and, except for Bumblebee, much of the activity is very recent:

```{r}
#| label: nurture type involvement volumes
##| out-width: 75%
##| fig-height: 3
#| fig-align: "center"
#| fig-ncol: 2

nurture_invs <- c("Nurture - BLT Hinde House","Nurture - BLT Yewlands", "Nurture - Bumblebee", "Nurture - BLT Earl Marshall", "Nurture - Step Out")

# plot volumes
involvements |>
  filter(cohort_name %in% nurture_invs,
         exam_year_inv_start >= 2010,
         exam_year_inv_start <= 2024) |>
  group_by(cohort_name, exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Nurture involvement volumes",
       subtitle = "involvements starting by academic year; 2025 is part year only",
       caption = "data from Capita One") +
  facet_wrap(vars(cohort_name)) +
  theme(axis.title.x = eb, axis.text.y = eb, strip.background = eb,
        axis.text.x = element_text(size = 6.5)) +
  scale_x_continuous(breaks = seq(2010,2025)) +
  coord_cartesian(clip = "off")
```

There are differences in the age profile of the different nurture type involvements.
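A side note on how age is derived for the age profile plots: the report takes `floor((open_date - dob) / 365.25)`, which can come out one year low when the open date falls exactly on a birthday. The sketch below is only an illustration of that edge case. `age_in_years` is a hypothetical helper, not part of the report's data model, and the dates are made up.

```r
# Hypothetical helper (not from the report's data model): age in completed
# years, counted from calendar year / month / day rather than days / 365.25.
age_in_years <- function(dob, on_date) {
  d <- as.POSIXlt(dob)
  o <- as.POSIXlt(on_date)
  # has the birthday already happened in the calendar year of `on_date`?
  had_birthday <- (o$mon > d$mon) | (o$mon == d$mon & o$mday >= d$mday)
  (o$year - d$year) - ifelse(had_birthday, 0L, 1L)
}

# Made-up dates: an involvement opening on the child's 10th birthday
dob       <- as.Date("2009-03-01")
open_date <- as.Date("2019-03-01")

approx_age <- floor(as.numeric((open_date - dob) / 365.25))  # method used in the report
exact_age  <- age_in_years(dob, open_date)                   # calendar-based alternative
```

For a bar chart of counts by age the difference is likely to be negligible; it would only matter if an exact age threshold were applied.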
```{r}
#| label: nurture type involvement age profile

nurture_stud_id <- involvements |>
  filter(cohort_name %in% nurture_invs) |>
  select(stud_id) |>
  unique()

nurture_age <- involvements |>
  filter(cohort_name %in% nurture_invs) |>
  select(stud_id, cohort_name, open_date) |>
  left_join(attend |>
              filter(stud_id %in% nurture_stud_id$stud_id) |>
              select(stud_id, dob) |>
              unique(),
            by = "stud_id"
            ) |>
  mutate(age = floor(as.numeric( (open_date - dob)/365.25))) |>
  group_by(cohort_name, age) |>
  summarise(child_count = n_distinct(stud_id))

ggplot(nurture_age, aes(x = age, y = child_count, label = child_count, fill = cohort_name)) +
  geom_col() +
  theme(axis.title.y = eb, axis.line.y = eb, axis.ticks.y = eb, legend.title = eb) +
  labs(title = "Nurture type involvements - count by age on involvement start",
       x = "age on involvement open date")
```

Nurture type involvements show continued low attendance after involvement, but with some signs of improvement over time:

```{r}
#| label: plot nurture avg before and after

inv_group_summary_long_start |>
  filter(involvement_group == "nurture") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present),
                width = 0.2, colour = "steel blue") +
  labs(title = "Average attendance before and after Nurture type involvements",
       subtitle = "Mean percentage of available sessions attended +- 95 CI",
       caption = "data from Capita One",
       x = "half term (relative to start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

Nurture type involvements show an immediate and dramatic reduction in exclusion rates:

```{r}
#| label: plot Nurture exclusions
#| fig-height: 3.5

inv_group_summary_long_start |>
  filter(involvement_group == "nurture") |>
  ggplot(aes(x = term_index, y = mean.percent_excluded)
  ) +
  geom_point(colour = "dark red") +
  geom_errorbar(aes(ymin = lower.ci.percent_excluded, ymax = upper.ci.percent_excluded),
                width = 0.2, colour = "dark red") +
  labs(title = "Exclusion before and after Nurture type involvements",
       subtitle = "Average percentage of available sessions where the child was excluded +- 95 CI",
       x = "half term (relative to involvement start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none") +
  geom_hline(yintercept = excl_2023_avg, linetype = "dashed", colour = "dark grey") +
  # accuracy is a plain numeric (0.1); the original 0.1L is not a valid integer literal
  geom_text(aes(x = 4,
                label = paste("2023 avg",scales::percent(excl_2023_avg, accuracy = 0.1)),
                y = excl_2023_avg),
            size = 2.5, vjust = 1, colour = "dark grey")
```

Returning to attendance, the stacked bar plot of attendance brackets is more revealing than the average attendance levels. Preschool half term periods and some COVID periods have been removed here. Although the overall average doesn't change, the involvement start date sees a big reduction in the worst attendance bracket (0 - 50%) and growth in both the highest (90 - 100%) and middle (50 - 80%) attendance brackets.
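To illustrate why the bracketed view can move while the average stays flat, here is a minimal sketch with made-up attendance figures for six pupils. The bracket edges are an assumption chosen to match the legend labels; how the data model script treats values that fall exactly on an edge is not shown in this report, and `to_bracket` is a hypothetical helper.

```r
# Made-up attendance proportions for six pupils, in the half term before
# and the half term after an involvement starts (illustration only).
before <- c(0.40, 0.45, 0.85, 0.85, 0.85, 0.85)
after  <- c(0.60, 0.65, 0.55, 0.55, 0.95, 0.95)

# Assumed bracket edges, chosen to match the legend labels in the plots.
to_bracket <- function(x) {
  cut(x,
      breaks = c(0, 0.5, 0.8, 0.9, 1),
      labels = c("0 - 50%", "50 - 80%", "80 - 90%", "90 - 100%"),
      right = FALSE, include.lowest = TRUE)
}

# The two means are the same ...
mean_before <- mean(before)
mean_after  <- mean(after)

# ... but the pupils are spread across the brackets very differently.
bracket_counts <- rbind(before = table(to_bracket(before)),
                        after  = table(to_bracket(after)))
bracket_counts
```

In this toy example the lowest bracket empties and the middle and highest brackets grow, with no change in the mean, which is the kind of pattern only the stacked bar plot can show.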
```{r}
#| label: categorised stacked bar nurture

# stacked bar of % category
stack_data <- involvements |>
  filter(cohort_name %in% nurture_invs) |>
  select(stud_id,starts_with("attend_cat")) |>
  pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index,"attend_cat_","")) |>
  mutate(index = str_replace(index,"start_","")) |>
  mutate(index = str_replace(index,"_"," ")) |>
  mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7",
                                          "prior 6","prior 5","prior 4","prior 3","prior 2","prior 1",
                                          "post 1","post 2","post 3","post 4","post 5","post 6",
                                          "post 7","post 8","post 9","post 10","post 11","post 12"))) |>
  group_by(index, attend_cat) |>
  filter(attend_cat != "out of scope") |>
  #filter(attend_cat != "preschool") |>
  filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 100%",
                                                    "preschool","leaving age","covid","out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb, legend.position = "right",
        axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Categorised attendance before & after Nurture type involvements",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label = str_wrap("inv start", width = 10),
           y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

```{r}
#| label: test plot involvement length
#| eval: false

# length of involvement?
involvements |>
  filter(cohort_name %in% nurture_invs, #== "Attendance Advice",
         !is.na(close_date)) |>
  select(stud_id,
         start = full_period_sequence_inv_start,
         close = full_period_sequence_inv_close) |>
  mutate(halfterms_open = close - start + 1,
         stud_id = factor(stud_id)) |>
  ggplot(
    aes(
      x = reorder(stud_id, halfterms_open),
      y = halfterms_open)
  ) +
  geom_col() +
  labs(title = "TEST length of involvement - nurture")

#check <- colnames(involvements) |> as.data.frame()
```

## Managed moves

The volumes available for managed moves are very low:

```{r}
#| label: managed move volumes
#| out-width: 75%
#| fig-align: "center"

involvements |>
  filter(cohort_name == "Managed Move") |>
  group_by(exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -0.8, size = 3) +
  barplottheme_minimal +
  labs(title = "Managed Move volumes",
       subtitle = "involvements starting by academic year",
       caption = "data from Capita One") +
  theme(axis.title.x = eb, axis.text.y = eb) +
  scale_x_continuous(breaks = seq(2010,2024)) +
  coord_cartesian(clip = "off")
```

Because the volumes are so small, the confidence intervals here are wide.
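As a rough illustration of how cohort size drives interval width, the sketch below computes the half-width of a 95% confidence interval for a mean at several cohort sizes. It assumes a t-based interval and a made-up between-pupil standard deviation; the data model script may calculate its intervals differently, and `ci_half_width` is a hypothetical helper.

```r
# Half-width of a t-based confidence interval for a mean.
# Assumption: a t interval; the report's data model may use another method.
ci_half_width <- function(n, sd, level = 0.95) {
  t_crit <- qt(1 - (1 - level) / 2, df = n - 1)
  t_crit * sd / sqrt(n)
}

# Made-up spread of attendance between pupils (25 percentage points),
# evaluated for cohorts of different sizes.
cohort_sizes <- c(5, 20, 100, 1000)
half_widths  <- ci_half_width(cohort_sizes, sd = 0.25)

data.frame(cohort_sizes, half_widths)
```

The half-width shrinks roughly with the square root of the cohort size, so a cohort of a handful of children gives an interval many times wider than one of several hundred.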
Managed Moves are associated with a continuing decline in attendance levels:

```{r}
#| label: plot managed move average attendance before and after
#| fig-height: 3.5

inv_summary_long_start |>
  filter(cohort_name == "Managed Move") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present),
                width = 0.2, colour = "steel blue") +
  labs(title = "Average attendance before and after Managed Move",
       subtitle = "Mean percentage of available sessions attended +- 95 CI",
       caption = "data from Capita One",
       x = "half term (relative to start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

Looking at coded reasons before & after a managed move, the overall increase in absence seems to be driven by an increase in 'no reason' absences:

```{r}
#| label: plot before & after by reason code for managed move

cohort_name_selection <- "Managed Move"

mean <- inv_summary |>
  filter(cohort_name == cohort_name_selection) |>
  ungroup() |>
  select(contains("mean")) |>
  select(contains("year")) |>
  pivot_longer(everything(), values_to = "mean", names_to = "name") |>
  filter(!str_detect(name, "close")) |>
  filter(!str_detect(name,"present")) |>
  mutate(prior_post = if_else(str_detect(name, "prior") == TRUE, "prior","post")) |>
  mutate(prior_post = factor(prior_post, levels = c("prior","post"))) |>
  mutate(name = str_replace(name, "mean_percent_","")) |>
  mutate(name = str_replace(name, "_prior","")) |>
  mutate(name = str_replace(name, "_post","")) |>
  mutate(name = str_replace(name, "_start","")) |>
  mutate(name = str_replace(name, "_year","")) |>
  mutate(name = case_when(name == "late_pres" ~ "late present",
                          TRUE ~ str_replace(name,"_"," "))) #|>

upper_ci <- inv_summary |>
  filter(cohort_name == cohort_name_selection) |>
  ungroup() |>
  select(contains("upper_ci")) |>
  select(contains("year")) |>
  pivot_longer(everything(), values_to = "upper_ci", names_to = "name") |>
  filter(!str_detect(name, "close")) |>
  filter(!str_detect(name,"present")) |>
  mutate(prior_post = if_else(str_detect(name, "prior") == TRUE, "prior","post")) |>
  mutate(prior_post = factor(prior_post, levels = c("prior","post"))) |>
  mutate(name = str_replace(name, "upper_ci_percent_","")) |>
  mutate(name = str_replace(name, "_prior","")) |>
  mutate(name = str_replace(name, "_post","")) |>
  mutate(name = str_replace(name, "_start","")) |>
  mutate(name = str_replace(name, "_year","")) |>
  mutate(name = case_when(name == "late_pres" ~ "late present",
                          TRUE ~ str_replace(name,"_"," ")))

lower_ci <- inv_summary |>
  filter(cohort_name == cohort_name_selection) |>
  ungroup() |>
  select(contains("lower_ci")) |>
  select(contains("year")) |>
  pivot_longer(everything(), values_to = "lower_ci", names_to = "name") |>
  filter(!str_detect(name, "close")) |>
  filter(!str_detect(name,"present")) |>
  mutate(prior_post = if_else(str_detect(name, "prior") == TRUE, "prior","post")) |>
  mutate(prior_post = factor(prior_post, levels = c("prior","post"))) |>
  mutate(name = str_replace(name, "lower_ci_percent_","")) |>
  mutate(name = str_replace(name, "_prior","")) |>
  mutate(name = str_replace(name, "_post","")) |>
  mutate(name = str_replace(name, "_start","")) |>
  mutate(name = str_replace(name, "_year","")) |>
  mutate(name = case_when(name == "late_pres" ~ "late present",
                          TRUE ~ str_replace(name,"_"," ")))

# join and plot
left_join(mean,upper_ci, by = c("name","prior_post")) |>
  left_join(lower_ci, by = c("name","prior_post")) |>
  ggplot(aes(x = name, y = mean, fill = prior_post,
             label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci, ymin = lower_ci), width = 0.2, alpha = 0.5,
                position = position_dodge(width = 0.9), colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons before and after Managed Move",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12), legend.position = "none",
        axis.title = eb, axis.text.y = eb)
```

# Child social care

There are three levels of child social care:

- Child in Need (CIN)
- Child Protection Plan (CPP)
- Child Looked After (CLA)

Although these are not involvements in the same sense as the other services & interventions, they have been treated as such in the data so that the start date can be used as a zero point to compare attendance levels before and after the episode starts. Unlike the other involvement types, which have been taken in order of volume, we'll look at CIN, CPP and CLA together.

### Child Social Care Volumes

Around 5000 new CIN episodes start each year.
```{r}
#| label: plot child in need volumes
#| out-width: 75%
#| fig-align: "center"

involvements |>
  filter(cohort_name %in% c("CIN","CPP","CLA"),
         exam_year_inv_start >= 2010,
         exam_year_inv_close <= 2024) |>
  group_by(cohort_name, exam_year_inv_start) |>
  summarise(child_count = n_distinct(stud_id)) |>
  ggplot(aes(x = exam_year_inv_start, y = child_count, label = child_count)) +
  geom_col(width = 0.8, fill = "steel blue") +
  geom_text(colour = "steel blue", vjust = -1, size = 3) +
  barplottheme_minimal +
  facet_grid(rows = vars(cohort_name), scales = "free_y", switch = "y") +
  labs(title = "Child Social Care volumes",
       subtitle = "episodes starting by academic year",
       caption = "data from Capita One") +
  theme(axis.title.x = eb,
        axis.text.y = eb,
        strip.background = eb,
        strip.placement = "outside",
        panel.spacing.y = unit(2, "lines"),
        strip.text.y.left = element_text(angle=0, vjust=1)) +
  scale_x_continuous(breaks = seq(2010,2024)) +
  coord_cartesian(clip = "off")

# (GR) note - to get the strip placement to work above, I've taken some theme elements from here:
# https://stackoverflow.com/questions/62356184/how-to-place-the-strip-in-facet-grid-on-top-of-plots-when-plotting-graphs-vertic
```

## Child In Need - effects on attendance

Prior to a CIN episode we see declining attendance levels. The CIN episode is associated with a turnaround in the average direction of travel, but attendance levels remain very low for the following two years. It's worth noting that average attendance across these children is below 90% for the entire four year window we're looking at here.
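To make the four year window concrete: the plots run from 12 half terms before the start date to 12 half terms after it, and with six half terms in an academic year that is 24 half terms, or four years. The sketch below shows one way such a relative index could be derived from half term sequence numbers. It is only an assumption about the approach (in particular, treating the half term containing the start date as "post 1" with no index 0); the data model script itself is not shown here, and all values are made up.

```r
# Hypothetical half term sequence numbers (six half terms per academic year).
# `start_seq` plays the role of the episode start; `period_seq` are the half
# terms for which one child has attendance data.
start_seq  <- 40
period_seq <- 25:55

# Assumption: the half term containing the start date counts as "post 1",
# so there is no index 0. The real data model may define this differently.
offset     <- period_seq - start_seq
term_index <- ifelse(offset >= 0, offset + 1, offset)

# Keep 12 half terms either side of the start: 24 half terms, i.e. four years
in_window     <- abs(term_index) <= 12
window        <- data.frame(period_seq, term_index)[in_window, ]
window_years  <- nrow(window) / 6
```

Aligning every episode on its own start date in this way is what allows children whose episodes began in different calendar years to be averaged on a common before-and-after axis.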
```{r}
#| label: plot CIN average attendance before and after

inv_summary_long_start |>
  filter(cohort_name == "CIN") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present),
                width = 0.2, colour = "steel blue") +
  labs(title = "Average attendance before and after Child In Need episode start",
       subtitle = "Mean percentage of available sessions attended +- 95 CI",
       caption = "data from Capita One",
       x = "half term (relative to start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

CIN episodes show an increase in overall absence in the year following episode start date. Most reason codes stay the same, with a small increase in "no reason" absences:

```{r}
#| label: plot before & after by reason code CIN
#| fig-height: 4

before_after_year_fun(cohort_name_selection = "CIN") |>
  ggplot(aes(x = name, y = mean, fill = prior_post,
             label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci, ymin = lower_ci), width = 0.2, alpha = 0.5,
                position = position_dodge(width = 0.9), colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons before and after Child In Need episode",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12), legend.position = "none",
        axis.title = eb, axis.text.y = eb) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 2))
```

Before we move on to the CPP episodes, here is the stacked bar plot of
bracketed attendance. First we'll include all categories, since it illustrates how many children with CIN episodes are of pre-school age both before and after the social care start date.

```{r}
#| label: categorised stacked bar CIN all

# stacked bar of % category
stack_data <- involvements |>
  filter(cohort_name == "CIN") |>
  select(stud_id,starts_with("attend_cat")) |>
  pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index,"attend_cat_","")) |>
  mutate(index = str_replace(index,"start_","")) |>
  mutate(index = str_replace(index,"_"," ")) |>
  mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7",
                                          "prior 6","prior 5","prior 4","prior 3","prior 2","prior 1",
                                          "post 1","post 2","post 3","post 4","post 5","post 6",
                                          "post 7","post 8","post 9","post 10","post 11","post 12"))) |>
  group_by(index, attend_cat) |>
  filter(attend_cat != "out of scope") |>
  #filter(attend_cat != "preschool") |>
  #filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 100%",
                                                    "preschool","leaving age","covid","out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb, legend.position = "right",
        axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Categorised attendance before & after Child In Need episode",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       fill = "") +
  geom_vline(xintercept = 12.5,
             linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label = str_wrap("inv start", width = 10),
           y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

Secondly, here is the plot with the preschool, leaving age and COVID periods removed, so that we can focus on the changing patterns of attendance:

```{r}
#| label: categorised stacked bar CIN attendance only

# stacked bar of % category
stack_data <- involvements |>
  filter(cohort_name == "CIN") |>
  select(stud_id,starts_with("attend_cat")) |>
  pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index,"attend_cat_","")) |>
  mutate(index = str_replace(index,"start_","")) |>
  mutate(index = str_replace(index,"_"," ")) |>
  mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7",
                                          "prior 6","prior 5","prior 4","prior 3","prior 2","prior 1",
                                          "post 1","post 2","post 3","post 4","post 5","post 6",
                                          "post 7","post 8","post 9","post 10","post 11","post 12"))) |>
  group_by(index, attend_cat) |>
  filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 100%",
                                                    "preschool","leaving age","covid","out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb, legend.position = "right",
        axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Categorised attendance before & after Child In Need episode",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label = str_wrap("inv start", width = 10),
           y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

## Child Protection Plan - effects on attendance

Child Protection Plan episodes are associated with a more significant turnaround in attendance levels:

```{r}
#| label: plot CPP average attendance before and after

inv_summary_long_start |>
  filter(cohort_name == "CPP") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present, ymax = upper.ci.percent_present),
                width = 0.2, colour = "steel blue") +
  labs(title = "Average attendance before and after Child Protection Plan episode start",
       subtitle = "Mean percentage of available sessions attended +- 95 CI",
       caption = "data from Capita One",
       x = "half term (relative to start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12,12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

CPP episodes are associated with decreases in all coded reasons.
These decreases are much more apparent at the half term level than the annual level:

```{r}
#| label: plot before & after by reason code CPP
#| fig-height: 6
#| fig-nrow: 2

before_after_ht_fun(cohort_name_selection = "CPP") |>
  ggplot(aes(x = name, y = mean, fill = prior_post,
             label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci, ymin = lower_ci), width = 0.2, alpha = 0.5,
                position = position_dodge(width = 0.9), colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons in the half term before and after Child Protection Plan episode",
       subtitle = "Average % of sessions missed one half term either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12), legend.position = "none",
        axis.title = eb, axis.text.y = eb) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 2))

# one year before & after
before_after_year_fun(cohort_name_selection = "CPP") |>
  ggplot(aes(x = name, y = mean, fill = prior_post,
             label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci, ymin = lower_ci), width = 0.2, alpha = 0.5,
                position = position_dodge(width = 0.9), colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons in the year before and after Child Protection Plan episode",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12), legend.position = "none",
        axis.title = eb, axis.text.y = eb) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 2))
```

Here is the categorised stacked bar plot for Child Protection Plans, showing an immediate and sustained increase in the proportion of children with the highest levels of attendance:

```{r}
#| label: categorised stacked bar CPP

# stacked bar of % category
stack_data <- involvements |>
  filter(cohort_name == "CPP") |>
  select(stud_id,starts_with("attend_cat")) |>
  pivot_longer(-stud_id,names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index,"attend_cat_","")) |>
  mutate(index = str_replace(index,"start_","")) |>
  mutate(index = str_replace(index,"_"," ")) |>
  mutate(index = factor(index, levels = c("prior 12","prior 11","prior 10","prior 9","prior 8","prior 7",
                                          "prior 6","prior 5","prior 4","prior 3","prior 2","prior 1",
                                          "post 1","post 2","post 3","post 4","post 5","post 6",
                                          "post 7","post 8","post 9","post 10","post 11","post 12"))) |>
  group_by(index, attend_cat) |>
  filter(attend_cat != "out of scope") |>
  filter(!attend_cat %in% c("out of scope","preschool","leaving age","covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat, levels = c("NA","0 - 50%","50 - 80%","80 - 90%","90 - 100%",
                                                    "preschool","leaving age","covid","out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq, fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb, legend.position = "right",
        axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Categorised attendance before & after Child Protection Plan",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to episode start date)",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label = str_wrap("inv start", width = 10),
           y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

## Child Looked After - effects on attendance

CLA episodes show the most dramatic change in attendance levels of anything we've covered in this report:

```{r}
#| label: plot CLA average attendance before and after

inv_summary_long_start |>
  filter(cohort_name == "CLA") |>
  ggplot(aes(x = term_index, y = mean.percent_present)) +
  geom_point(colour = "steel blue", size = 2) +
  geom_errorbar(aes(ymin = lower.ci.percent_present,
                    ymax = upper.ci.percent_present),
                width = 0.2, colour = "steel blue") +
  labs(title = "Average attendance before and after Child Looked After episode start",
       subtitle = "Mean percentage of available sessions attended +/- 95% CI",
       caption = "data from Capita One",
       x = "half term (relative to start date)") +
  barplottheme_minimal +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(breaks = seq(-12, 12)) +
  theme(axis.text.x = element_text(size = 8), legend.position = "none")
```

CLA episodes show a significant decrease across all coded absence reasons except family holidays (which is less of an issue for this cohort in any case):

```{r}
#| label: plot before & after by reason code for CLA

before_after_year_fun(cohort_name_selection = "CLA") |>
  ggplot(aes(x = name, y = mean, fill = prior_post,
             label = scales::percent(mean, accuracy = 0.01))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymax = upper_ci, ymin = lower_ci),
                width = 0.2, alpha = 0.5,
                position = position_dodge(width = 0.9),
                colour = "gray20") +
  geom_text(position = position_dodge(width = 0.9), vjust = -0.2, size = 3) +
  scale_fill_met_d("Egypt") +
  scale_colour_met_d("Egypt") +
  scale_y_continuous(labels = scales::percent) +
  barplottheme_minimal +
  labs(title = "Absence reasons before and after Child Looked After episode",
       subtitle = "Average % of sessions missed one year either side of involvement or episode start date",
       caption = "data from Capita One") +
  theme(plot.title = element_markdown(size = 12),
        legend.position = "none",
        axis.title = eb,
        axis.text.y = eb) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 2))
```

Finally, here is the stacked categorised attendance plot for Child Looked After episodes, showing an immediate and sustained increase in the proportion of children with the highest levels of attendance:

```{r}
#| label: categorised stacked bar CLA

# stacked bar of % category
stack_data <- involvements |>
  filter(cohort_name == "CLA") |>
  select(stud_id, starts_with("attend_cat")) |>
  pivot_longer(-stud_id, names_to = "index", values_to = "attend_cat") |>
  mutate(index = str_replace(index, "attend_cat_", "")) |>
  mutate(index = str_replace(index, "start_", "")) |>
  mutate(index = str_replace(index, "_", " ")) |>
  mutate(index = factor(index, levels = c(
    "prior 12", "prior 11", "prior 10", "prior 9", "prior 8", "prior 7",
    "prior 6", "prior 5", "prior 4", "prior 3", "prior 2", "prior 1",
    "post 1", "post 2", "post 3", "post 4", "post 5", "post 6",
    "post 7", "post 8", "post 9", "post 10", "post 11", "post 12"))) |>
  group_by(index, attend_cat) |>
  filter(attend_cat != "out of scope") |>
  #filter(attend_cat != "preschool") |>
  filter(!attend_cat %in% c("out of scope", "preschool", "leaving age", "covid")) |>
  tally() |>
  mutate(freq = n / sum(n)) |>
  filter(!is.na(index)) |>
  mutate(attend_cat = factor(attend_cat, levels = c(
    "NA", "0 - 50%", "50 - 80%", "80 - 90%", "90 - 100%",
    "preschool", "leaving age", "covid", "out of scope")))

ggplot(stack_data, aes(x = index, y = freq, label = freq,
                       fill = attend_cat, group = attend_cat)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  #geom_text(position = position_stack(reverse = TRUE, vjust = 0.5), size = 2.5, angle = 90) +
  theme(axis.title.y = eb,
        legend.position = "right",
        axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  scale_fill_manual(values = stack_colours, guide =
                      guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Categorised attendance before & after Child Looked After episode",
       subtitle = "Percentage of pupils in each category",
       x = "half term index (relative to involvement start date)",
       fill = "") +
  geom_vline(xintercept = 12.5, linetype = "dashed", colour = "gray40") +
  annotate(geom = "text", label = str_wrap("inv start", width = 10),
           y = 1.05, x = 13.5, colour = "gray40", size = 2.5) +
  coord_cartesian(clip = "off")
```

## Conclusions

This concludes the report. The picture in terms of effectiveness is very mixed, though a few themes have emerged:

- Most interventions are followed by a change in the overall direction of the attendance trend, but rarely do we see a net gain.
- Some interventions are effective on exclusion rates, but not on overall attendance.
- Interventions are more effective with younger children.
- There is some evidence that girls respond better than boys.
- Children's Social Care episodes are associated with particularly strong improvements in attendance.
- The Family Intervention Service is associated with an immediate improvement, and sustained improvement follows. The very high volumes of this service, and the relatively low attendance of children in the run-up to an episode of FIS, together suggest an opportunity.

Finally, we should restate that where we have found no evidence of a direct increase in attendance levels following an intervention, this is not evidence that such interventions are failing or not worth doing. There may be benefits beyond what is visible in the attendance data, and the likelihood is that these interventions are having a limiting effect on what would otherwise be an even worse situation.
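
For readers who want to sanity-check a before/after comparison of the kind charted in this report, the following is a minimal sketch rather than the report's actual helper code. It assumes toy data and a t-based 95% confidence interval; `absence`, `mean_ci`, `before_after` and `paired_change` are hypothetical names introduced here for illustration, and the report's own functions may calculate intervals differently.

```r
# Hypothetical toy data: one row per pupil, with the proportion of sessions
# missed in the year before and the year after an episode start date.
absence <- data.frame(
  stud_id = 1:6,
  prior   = c(0.20, 0.15, 0.30, 0.10, 0.25, 0.20),
  post    = c(0.10, 0.12, 0.18, 0.08, 0.15, 0.11)
)

# Mean with a t-based confidence interval (an assumed method, not
# necessarily the one used by the report's helper functions).
mean_ci <- function(x, level = 0.95) {
  x <- x[!is.na(x)]
  n <- length(x)
  se <- sd(x) / sqrt(n)
  half_width <- qt(1 - (1 - level) / 2, df = n - 1) * se
  c(mean = mean(x),
    lower_ci = mean(x) - half_width,
    upper_ci = mean(x) + half_width)
}

# Summary in the same shape as the dodged bar charts: one row per period.
before_after <- rbind(
  prior = mean_ci(absence$prior),
  post  = mean_ci(absence$post)
)

# Paired change per pupil: negative values mean absence fell.
paired_change <- mean_ci(absence$post - absence$prior)

print(round(before_after, 3))
print(round(paired_change, 3))
```

A paired interval that excludes zero suggests a change within the cohort, but, as noted above, it says nothing about what attendance would have been without the intervention.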