Empirical evaluation of home-based reablement: A review

Home-based reablement (HBR) aims to restore or increase patients’ level of functioning, thereby increasing the patients’ self-reliance and consequently decreasing their dependence on healthcare services. To date, the evidence on whether HBR is an efficient method has not been comprehensively reviewed. The aim of this study was to provide a concise summary of relevant existing findings. In addition, we provide a critical constructive assessment of the publications reflecting the extant research. The relevant literature on this topic was identified through a systematic search of appropriate databases. Thereafter, we screened the studies, first by title, followed by abstract and then by assessing full-text eligibility. A checklist of 15 criteria was developed and used as the basis for the quality assessment. In total, 12 studies from Australia, New Zealand, the USA and Norway were included in the full-text review. The studies reported estimated cost differences between HBR and usual care after the intervention. All the studies indicated lower costs for HBR, but not all of them reported a significant difference. The same pattern was also found for other measures of physical functioning and quality of life. The assessment revealed one specific common pattern: None of the papers scrutinized provided sufficient information about the data or the statistics employed, and all lacked external validity. Some promising results have been reported with respect to HBR reducing the need for specialist or residential care. In short, the existing evidence regarding the effects of HBR is still inconclusive. The findings from the quality assessment should motivate a multidisciplinary approach for future research on HBR. JEL classification: I19, C18


Introduction
The western world is facing a significant demographic change in the coming years. These forthcoming developments are expected to lead to a persistent change in the age distribution of the population. As the elderly population grows, the number of individuals facing age-related diseases and multimorbidity will increase (Barnett et al., 2012). Costs of healthcare services increase with age and with the degree of multimorbidity (Yoon et al., 2014). According to Martins and de la Maisonneuve (2006), long-term care costs for people over 65 years old are predicted to double or triple by 2050 in countries belonging to the Organization for Economic Co-operation and Development (OECD). Along with these upcoming challenges, the number of participants in the workforce per senior citizen in OECD countries will decrease (OECD, 2017).
The upcoming challenges will increase the demand for long-term services such as home-based care (Ryburn et al., 2009). Since home-based care is more cost-effective, many high-income countries are actively bolstering a shift from residential care to home-based care as a potentially more financially sustainable approach (Cochrane et al., 2013). Another incentive for this shift is that a majority of older people prefer to 'age in place' (Wiles et al., 2012). The forthcoming challenges will force healthcare to focus more on preventive measures, early intervention, new technology, rehabilitation, healthcare services that are less manpower-intensive, and services that empower senior citizens to self-manage chronic diseases (Europe, 2012).
Home-based reablement (HBR), known as restorative care in Australia, the USA and New Zealand, is a fairly new way of providing homecare services. The main goal of HBR is to restore or increase patients' level of functioning, thereby increasing the patients' self-reliance and consequently decreasing their dependence on healthcare services. Even though HBR is not a standardized treatment and varies in content, all such interventions have the same goal (Whitehead et al., 2015;Tuntland et al., 2014). This type of intervention has gained significant prominence internationally in recent years (Cochrane et al., 2016). The main features of being time-limited, multidisciplinary, home-based, goal-oriented, and person-centred are common across HBR programmes. Patients are mainly senior citizens with or at risk of functional decline (Aspinal et al., 2016). Typically, a multidisciplinary team works towards a patient-defined goal concerning everyday activities important to the patient (Tuntland et al., 2014). HBR is not to be confused with "standard" rehabilitation or home-based rehabilitation. The latter is often medically directed and occurs in a hospital or ambulatory setting. In addition, rehabilitation is usually provided after an acute event, whereas HBR often follows a gradual decline and can be applied in a preventive manner (Metzelthin et al., 2020). A Danish study concluded that policy-makers mainly motivated by economic considerations were pivotal for the implementation of HBR (Fersch, 2015). High-quality care is clearly an essential goal in health care services, but future resources are limited, inevitably leading to priority setting and trade-offs (Emmert et al., 2012). Assessing the efficiency and effects of new interventions, including HBR, is therefore crucial.
To our knowledge, there are few comprehensive reviews of research related to the effects of HBR. No HBR studies were included in an overview of systematic reviews on economic evaluations of rehabilitation (Howard-Wilsher et al., 2016). Five HBR studies were included in a systematic review identifying interventions that aimed to reduce dependency in activities of daily living (ADL) (Whitehead et al., 2015). The two studies most similar to our paper are those by Tessier et al. (2016) and Legg et al. (2016), both systematic reviews from 2016. Tessier et al. (2016) examined the effectiveness of HBR and factors that might contribute to successful implementation for Canadian policy makers. They focused on three outcomes; function, health-related quality of life (HRQoL) and service utilization, concluding that there is good evidence supporting the effectiveness of HBR, especially regarding HRQoL and service utilization. Interestingly, Legg et al. (2016) studied whether publicly funded HBR affected patient health or use of services. They found no data suitable for evaluating the effects of HBR and concluded that there is no evidence that HBR fulfils its goals. In sum, previous reviews either focus on minor aspects of potential benefits of HBR alone or do not include studies on HBR, as such studies failed to meet the inclusion criteria defined by the respective reviewers. It is therefore the objective of this paper to provide a comprehensive review of current literature assessing HBR through empirical evaluation. First, we aim to provide a concise summary of relevant existing findings generated in the course of the research process. In addition, we provide a critical constructive assessment of the publications reflecting the extant research. The application of statistical concepts and models plays a central role in the research efforts we analysed. 
Consequently, our review adopts a dual perspective: the health-economic angle is augmented by a pronounced statistical/econometric viewpoint.
The remainder of the paper is organized as follows: In Section 2, we outline the methodological basis for this review. The main findings from relevant HBR research and the results of our assessment are presented in Section 3. Section 4 provides a thorough discussion of the results. Concluding remarks in Section 5 finalize the paper.

2.1 Search strategy
We designed and implemented a sufficiently sensitive search and selection strategy, relying on the expertise of an experienced librarian. Given the intrinsically multidisciplinary nature of HBR, we needed to extend our search to multiple databases covering the fields of medicine, health care, social work and economics. Thus, the search algorithms were applied in the databases Scopus, EBSCOhost, CINAHL Plus (with full text), MEDLINE, Academic Search Complete, SocINDEX, Social Work Abstracts, Business Source Complete and Econlit. The development of the search syntax reflects the terminological uncertainty concerning HBR as well as our goal to allow for the location of publications that assess the economic dimension of the care strategy studied. The search results discussed below are based on the string "((reablement OR re-ablement) OR (restorative W/3 (home OR care))) AND (economic* OR cost* OR evaluation*)", where the sub-command "restorative W/3 (home OR care)" indicated that we were looking for instances in which either the term "home" or the term "care" can be found within a three-word neighbourhood of the term "restorative". No search filters were applied, and the same search string was used on all databases. The initial search was performed on 2016-03-17. It resulted in a total of 605 records. Consecutive updates were run on 2016-08-03, 2017-11-15 and 2019-09-04. All databases were searched on the same search date, and the detailed search string can be found in supplementary material section S1. Figure 1 shows the main steps of our sequential search and selection process.
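To make the logic of the search string concrete, the following minimal Python sketch applies the same Boolean structure to a single title or abstract. It is an illustration only and not the query engine of any of the databases: the function names (`tokenize`, `near`, `matches_query`) are hypothetical, and the proximity operator is assumed to mean "at most three word positions apart, in either order", which may differ from how a given database counts distance.

```python
import re

def tokenize(text):
    """Lower-case the text, merge 're-ablement' into one token, and split into words."""
    text = text.lower().replace("re-ablement", "reablement")
    return re.findall(r"[a-z]+", text)

def near(words, anchor, partners, max_gap=3):
    """True if `anchor` and any word in `partners` are at most `max_gap` positions apart.

    Assumption: distance is the difference in word positions, in either order.
    """
    anchor_pos = [i for i, w in enumerate(words) if w == anchor]
    partner_pos = [i for i, w in enumerate(words) if w in partners]
    return any(abs(i - j) <= max_gap for i in anchor_pos for j in partner_pos)

def matches_query(text):
    """Evaluate ((reablement OR re-ablement) OR (restorative W/3 (home OR care)))
    AND (economic* OR cost* OR evaluation*) on one piece of text."""
    words = tokenize(text)
    has_reablement = "reablement" in words
    has_restorative = near(words, "restorative", {"home", "care"})
    # The trailing * in the search string is a truncation wildcard: prefix match.
    has_economic = any(w.startswith(("economic", "cost", "evaluation")) for w in words)
    return (has_reablement or has_restorative) and has_economic
```

Under these assumptions, a title such as "Restorative home care: an economic evaluation" is retrieved, whereas a title in which "restorative" and "care" are far apart (for example in restorative dentistry) is not, unless it also mentions reablement.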

2.2 Eligibility criteria
Our work was guided by a predefined list of inclusion and exclusion criteria. A study qualified for inclusion if it (i-1) contained at least a partial evaluation of HBR on some quantifiable economic measure, direct or indirect, i.e., concepts like "effectiveness", "benefits" and "costs" of the treatment were considered, and (i-2) was published in a peer-reviewed academic journal. We agreed to exclude studies of reablement (e-1) closely linked to dental health or paediatrics or (e-2) provided by and in hospitals or nursing homes. Moreover, an article was excluded if (e-3) it could be classified as a "conceptual article", "review article" or "research protocol", or if (e-4) it did not assess well-defined comparator intervention(s), such as traditional care or another alternative. Titles, abstracts and full texts were checked against the inclusion and exclusion criteria by at least two authors independently. Table 1 contains the PICOS criteria for inclusion.

Selection and categorization
One reviewer (Author 1) organized and carried out the initial search and eventually removed duplicates in cooperation with the co-authors. Following this initial stage, a stepwise elimination procedure based on (e-1) to (e-4) was performed. First, two reviewers (Author 1, Author 2) collaborated to filter records by keywords appearing in the title and the journal name. The keywords used for this purpose were "dental", "dentist", "caries", "children", "oral", and "surgery". For all matches, titles were screened, and records were removed if required. In the second stage, two reviewers (Author 1, Author 2) independently screened the remaining titles. In almost 80% of those cases, the reviewers came to a unanimous decision. As a rule, a split decision led to inclusion of the article in question. In the following stage, all reviewers independently screened the abstracts of all remaining records before discussing full-text eligibility.
During subsequent updates, one reviewer (Author 1) performed the filtering process on all new titles. Subsequently, the reviewers (Author 1, Author 3) screened the remaining titles for abstract eligibility. While the first update led to the inclusion of five new records, no additional articles could be identified during the second update. The third update identified one additional article. Next, each reviewer independently read and analysed the articles identified in the previous stages to decide full-text eligibility. Finally, following a discussion, the team of reviewers reached a consensus on the pool of studies to be included in this review.
Included studies were categorized, inspired by Emmert et al. (2012). Studies that focus on cost and other consequences regarding economic efficiency were grouped into Category 1. Studies evaluating health benefits for patients were placed in Category 2. Category 3 includes articles that assessed the consequences of HBR on health-service usage. Studies with multiple outcome measures were categorized by their primary outcome measure.

Results
The 12 articles that met our eligibility criteria are presented in Table 2. Four studies were conducted in Australia, three in New Zealand, three in Norway and two in the US. The Australian HBR model specifically targets patients with low to medium levels of need (Lewin et al., 2008), whereas the HBR interventions in New Zealand target frailer, older patients on the verge of residential care (Senior et al., 2014). The other studies included did not have a directly specified target group in terms of needs. In all reviewed studies, the multidisciplinary teams were composed of a physiotherapist, an occupational therapist, and a nurse. One of the team members functioned as a care manager for each client (Lewin et al., 2016).
For data synthesis, a narrative qualitative synthesis of the eligible studies was carried out. Narrative summaries and tables were compiled for characteristics and findings of included studies. The first author conducted the synthesis, with extensive discussion and agreement involving all authors. The PRISMA guideline was used when feasible (Moher et al., 2009), and the PRISMA checklist can be found in supplementary material section S2. For the synthesis, well-known guidance was used (Popay et al., 2006;Akers et al., 2009). HBR is a personalized intervention, and the included studies had a heterogeneous range of outcome measures. We were therefore unable to perform a meta-analysis.

Category 1 - Costs and consequences
Kjerstad and Tuntland (2016) carried out a cost-effectiveness analysis (CEA) of HBR using data from the randomized controlled trial (RCT) by Tuntland et al. (2015). The sample consisted of 61 participants (HBR = 31 and control = 30). The CEA was conducted on 46 participants (HBR = 25 and control = 21). All participants were assessed at baseline and at 3 and 9 months. Self-perceived activity performance and satisfaction with performance were chosen as effectiveness measures. There was no significant difference in the mean cost per participant during the intervention period (3 months). At the 9-month follow-up (6 months after the intervention period), the authors found a significant difference in mean cost per visit in favour of HBR. However, the difference of 1.5 € (14.7 NOK) was modest. There was no statistically significant difference in mean cost per participant. The incremental cost-effectiveness ratios for the intervention period were -89.5 € (-868.2 NOK) for the activity performance measure and -68.7 € (-666.3 NOK) in terms of satisfaction with performance.
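For readers less familiar with the incremental cost-effectiveness ratios reported by Kjerstad and Tuntland (2016), the standard definition is the difference in mean cost divided by the difference in mean effect between the intervention and the comparator. The sketch below illustrates that definition only. The function name `icer` and all numbers are hypothetical and are not taken from the study.

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("ICER is undefined when the effect difference is zero")
    return delta_cost / delta_effect

# Hypothetical illustration (not study data): the new intervention costs
# 200 less per participant and scores 2 points higher on the effect measure.
example = icer(cost_new=1000.0, cost_old=1200.0, effect_new=5.0, effect_old=3.0)
print(example)  # -100.0
```

A negative ratio is ambiguous on its own: it arises both when an intervention is cheaper and more effective and when it is costlier and less effective, so the signs of the two differences should be inspected alongside the ratio.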
Using data from an Australian RCT (Lewin et al., 2013b), Lewin et al. (2014) examined the use of healthcare services and the associated costs of HBR compared to conventional care. Participants were compared at baseline and after 1- and 2-year follow-ups. For the intention-to-treat (ITT) analysis, 750 participants were included, 375 in each group. The actual treatment (AT) analysis was conducted on 705 participants (HBR = 310 and control = 395). A significantly lower proportion of HBR participants were approved for residential or equivalent homecare at the end of the study. The HBR group also had a 30% reduced risk for emergency department presentation in the AT analysis. Over the 2-year period, the mean aggregated cost per participant was lower for the HBR group, and the difference was 1,821 € (AU$2,869) in the ITT analysis and 2,754 € (AU$4,338) in the AT analysis. The HBR group was significantly less costly in the first year and over the total 2-year period in the AT analysis only. Randomization of participants was compromised, and there was some measurement bias in hours of service.
In a retrospective study, Lewin et al. (2013a) investigated whether individuals using HBR reduced their need for ongoing services and had lower homecare costs compared to those receiving usual care. By linking several data sources, the authors created a dataset with 10,368 individuals and a time period of 57 months. The individuals received usual care or either of two different HBR versions. In the first HBR version the patients were referred from the community, and in the second version patients were discharged from the hospital. HBR users in both versions were less likely to use ongoing services over the first 3 years compared to those receiving usual care. This effect persisted over the whole time period for HBR users who were referred from the community. The costs for both HBR groups were substantially lower than those for conventional care over the observation period. The median savings per HBR participant after 57 months amounted to more than 7,935 € (AU$12,500) in both HBR groups.

3.2 Category 2 - Health benefits
A cluster RCT conducted in New Zealand by King et al. (2012) examined the impact of HBR versus usual care and applied HRQoL as the primary outcome. The following secondary outcomes were included: functional mobility, sense of control and social support network. All outcome data were collected at baseline and at 4- and 7-month follow-ups with structured face-to-face interviews. In total, 186 participants were included at baseline, 93 participants in each group. At the 7-month assessment, 157 participants remained (HBR = 76 and control = 81). HRQoL was measured by the 36-Item Short Form Health Survey (SF36). The instrument provides separate mental and physical subscores. After adjusting for baseline demographics, the SF36 overall score differences were statistically significant at the 10% level in favour of the HBR group. Splitting the SF36 into the two different components indicated significant results for the mental subscore only. For all the secondary outcomes, no evidence for significant differences was found.
Lewin and Vandermeulen (2010), utilizing data collected from 2001 to 2003, used a non-randomized design when investigating whether HBR participants had better personal and service outcomes compared to those receiving usual care. Data were collected manually with standardized outcome measures of functional independence, confidence, and wellbeing. All participants were assessed at baseline, 3 months and 1 year. One hundred participants were included in each group at baseline. At the 1-year follow-up, 140 participants were left (HBR = 67 and control = 73). At both follow-ups, the HBR group showed improvements in all measures, whereas the participants receiving usual care remained mostly the same. These differences were significant, and regression analysis also confirmed these results for all measures except the Philadelphia Geriatric Morale Scale.
HBR participants also had a statistically significant decrease in the probability of needing ongoing services. The authors pinpointed three major limitations: some potential selection bias, a lack of independent data to confirm the service outcomes and a lack of clinical information.
Parsons et al. (2013) used a clustered RCT to determine whether HBR improved physical functioning and social support compared to standard care. The study included 205 participants at baseline, and 197 remained at the 6-month follow-up (HBR = 106 and traditional care = 91). Physical functioning was measured by the Short Physical Performance Battery (SPPB). The SPPB test contains three elements: standing balance, timed walk and timed rising/sitting from a chair. The results were interpreted conservatively, and therefore, a 1% significance level was used in the primary analysis. All evaluations followed the ITT principle. The HBR group had a significantly greater mean increase in overall SPPB score and in the walk component over time compared to the usual care group. Social support showed no difference over time. There was also no evidence for a significant relationship between allied health referrals and improvement in physical functioning. The authors argue that there is considerable ambiguity in determining whether a clinically meaningful change in physical function can be associated with HBR.
Tuntland et al. (2015) conducted an RCT to evaluate the effect of HBR compared to usual care on self-perceived activity performance and satisfaction with performance. Secondary outcomes were physical functioning and HRQoL. Sixty-one participants were assessed at baseline and at 3- and 9-month follow-ups. At the last follow-up, 51 participants remained (HBR = 25 and control = 26). The main outcome was measured by the Canadian Occupational Performance Measure (COPM), and analyses followed the ITT principle.
There was a significant mean difference in COPM-Performance at both the 3- and 9-month follow-ups. For COPM-Satisfaction, there was only a significant mean difference after 9 months. All results were in favour of HBR. However, all differences were below the cut-off value of 2, which according to the COPM manual indicates a clinically relevant change. The authors acknowledge this value but also argue that there is a lack of evidence supporting this cut-off value. None of the secondary outcomes showed significant differences. The study constraints rendered it inevitable that the same healthcare personnel provided services to both groups.
Langeland et al. (2019) presented results of a controlled clinical trial involving 47 municipalities in Norway. The primary outcome was measured with COPM. At baseline, 828 participants were included (HBR = 707 and control = 121), and 348 remained at the 12-month follow-up (HBR = 294 and control = 54). Significant mean effects were found in favour of HBR on COPM-Performance and COPM-Satisfaction, both at the 10-week and the 6-month follow-up. A series of secondary outcomes was measured with different instruments. Physical function, measured with SPPB, showed a significant treatment effect in favour of HBR at both the 6- and 9-month follow-ups. Health-related quality of life, measured with the European Quality of Life Scale (EQ-5D-5L), showed a significant treatment effect in mobility, personal care, usual activities, and current health at the 6-month follow-up. Sense of coherence, measured with the Sense of Coherence Questionnaire, showed a significant effect in favour of the HBR group at the 6-month follow-up. Interestingly, all measures except SPPB were insignificant at the 12-month follow-up using a significance level of 5%.

3.3 Category 3 - Health services usage
An Australian RCT carried out by Lewin et al. (2013b) investigated whether individuals receiving HBR had less need for ongoing services compared to those receiving usual care. Data were collected at baseline and at 3 and 12 months. The study also included secondary outcomes by examining functional status and quality of life (QoL) in a subgroup recruited within the RCT group. For the AT (ITT) analysis, 294 (300) participants were recruited to this subgroup at baseline. At the 12-month follow-up, 192 (198) participants remained, and 100 (88) of these received HBR. HBR was found to significantly reduce the probability of using ongoing services. These results hold for the ITT and AT analyses at both follow-ups. Regarding functional status, there was a significant difference between the groups at the 12-month follow-up. Functional dependency increased for the usual care group between the 3- and 12-month follow-ups but was maintained in the HBR group. The latter results were only significant in the AT analysis. QoL showed no significant difference between groups.
Using an RCT design, Senior et al. (2014) examined whether HBR participants reduced their need for permanent residential care over a 24-month period. The study included secondary outcomes focusing on functional and social health, measured at the 18-month follow-up. Patients received HBR either at home or in a short-term facility. Sample size was 105 participants (HBR = 52 and control = 53). Only 17 participants were included in the 18-month follow-up (HBR = 11 and control = 6). All patients included were at high risk of residential care placement. The ITT principle was used in all analyses. For the combined primary outcome of death or residential care, there were no statistically significant results. The point estimate, a 24% reduction in the probability of residential care or death in favour of HBR, was not statistically significant. None of the secondary outcomes showed statistically significant differences. The authors argued that the lack of blinding constituted a limitation.
Tinetti et al. (2002) used a controlled clinical trial to compare usual care versus HBR in areas like functional status, likelihood of remaining at home, duration and intensity of the homecare episode, emergency visits to a physician and emergency department (ED) visits. There were 691 HBR users included, and from a pool of potential control participants, 691 pairs were created. A subset of 382 pairs was created for patients remaining at home after the completion of either HBR or usual care. Data on functional outcomes were only available for this subset. HBR patients were significantly more likely to remain at home after completion of the homecare episode. The study showed no significant difference in the likelihood of visits to a physician's office. HBR patients were less than half as likely to have an ED visit during the homecare episode. Patients in the HBR group had significantly shorter homecare durations compared to those receiving usual care.
Discharge scores for self-care, home management and mobility were all slightly, but significantly, higher for HBR users.
Tinetti et al. (2012) aimed to analyse the frequency of hospital readmissions for HBR compared to usual care after an acute hospitalization. Data were based on the original clinical trial study (Tinetti et al., 2002). In total, 770 participants were included, comprising 341 matched pairs and 88 additional unmatched participants. Outcome variables were hospital readmission and length of homecare episode. The algorithm previously used in Tinetti et al. (2002) was applied to generate matched pairs. The mean length of homecare episodes was significantly different between the two groups: the HBR group mean length was shorter than that of the control group. According to a conditional logistic regression analysis, HBR participants were 32% less likely to be readmitted than participants receiving usual care in the matched pair analysis. For the unmatched analysis, the corresponding result was 29%. The statistical significance was only marginal, with p-values for the matched and unmatched analyses of 0.10 and 0.09, respectively.
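The matched-pair result of Tinetti et al. (2012) can be made more tangible with a small sketch. For 1:1 matched pairs with a binary outcome and treatment as the only covariate, the conditional logistic regression estimate of the odds ratio reduces to the ratio of the two kinds of discordant pairs. The code below illustrates that special case only. It is not the authors' model, which may have included further covariates, and the function name and pair counts are hypothetical, chosen so that the odds ratio is 0.68.

```python
def matched_pair_odds_ratio(pairs):
    """Conditional odds ratio for 1:1 matched pairs with a binary outcome.

    `pairs` is a list of (treated_event, control_event) booleans. Only
    discordant pairs (exactly one member has the event) carry information.
    """
    treated_only = sum(1 for t, c in pairs if t and not c)
    control_only = sum(1 for t, c in pairs if c and not t)
    if control_only == 0:
        raise ValueError("odds ratio undefined without control-only discordant pairs")
    return treated_only / control_only

# Hypothetical counts (not study data): 100 matched pairs in total.
pairs = (
    [(True, False)] * 17     # only the HBR member was readmitted
    + [(False, True)] * 25   # only the usual-care member was readmitted
    + [(True, True)] * 8     # both readmitted (uninformative)
    + [(False, False)] * 50  # neither readmitted (uninformative)
)
odds_ratio = matched_pair_odds_ratio(pairs)
print(round(odds_ratio, 2))  # 0.68
```

An odds ratio of 0.68 corresponds to 32% lower odds of readmission. Whether the reported "32% less likely" refers to odds or to risk is not stated in the text above, so this reading is an assumption.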

Assessment
We devised an instrument suitable for the assessment of research papers that are related to the complex topic of HBR. Scores for each study are presented in Table 2; the maximum possible score is 15 points. The instrument is presented and explained in the supplementary material Section S3. For a more in-depth presentation of the results and detailed scores reflecting the assessment, see Section S4 in the supplementary material.
Analysing the outcomes of the assessment process suggests that while a typical HBR paper describes the motivation and all aspects of the research question in a satisfactory manner, the documentation of data-related issues could clearly be improved. The latter issue also seems to contribute slightly more to the heterogeneity in quality.
The most striking outcome of the assessment is that the majority of the HBR papers under review failed to be informative about key aspects of the statistical modelling. This is surprising, since due to the nature of our selection process, all papers under review appear to rely on statistical methodology. One can group the techniques implemented into two groups, i) mean comparisons, both parametric and non-parametric, and ii) regression analyses. Table 3 lists the different models and inferential techniques applied in the context of the primary outcomes.
Apparently, various types of regression models feature prominently in the HBR literature. According to our assessment, it is a prominent feature of the published HBR literature that the choice of such a model is virtually never justified. Alternative modelling approaches are not explicitly discussed. Models are not presented explicitly. Underlying key assumptions are not documented, and it is typically not substantiated that they hold considering the data collected. The 'path' leading from the data to the model is not made explicit. This, of course, has negative ramifications for the reader's ability to critically appraise the results as well as for the replicability of the research documented. To be clear on this point, we do not believe that the authors ignored the stated aspects of statistical modelling in the research process. We simply point out the fact that, for whatever reason, there is not enough space allocated to such considerations in the publications under scrutiny.
Responses to item 14 regarding external validity suggest that the HBR studies existing so far still lack external validity. The fact that all studies were assigned a '0' score on the item regarding theoretical foundation does not come as a surprise.

Table 3. Statistical methods for primary outcome in each study. GLM = Generalized linear model. GLMM = Generalized linear mixed model. MM = Mixed model, also called mixed effects models (Cameron and Trivedi, 2005). Cox-Hazard = Cox proportional hazard model.

Discussion
In our view, none of the papers scrutinized provided sufficient information about the data or the statistics employed. We do not believe that this evidence is indicative of the quality of the underlying research process. More likely, our findings reflect an established publication standard idiosyncratic to the health and medical journals where these studies were published. Knowledge of the sampling procedure and the process of data generation is essential for choosing an identification strategy. Without this information, the reader will not be able to fully understand the data or the strengths and weaknesses of the study. Ten out of twelve studies applied regression, yet the models were not presented. None of the studies provided information regarding the estimation technique used or possible adjustments of the standard errors. Not providing this type of essential information leads to a lack of transparency that in turn will reduce the replicability of a study.
Since seven of the included studies were RCTs, it is interesting to discuss RCTs more explicitly. The RCT is often considered the "gold standard" approach, if such a standard exists (Cartwright, 2007). In biostatistics, RCTs are often viewed as the only credible approach, while experimental evaluations traditionally have been less common in economics (Imbens and Wooldridge, 2009). The primary benefit of an experiment lies in the fact that it solves the selection bias problem, not by removing the bias but by balancing the bias between the experimental groups (Heckman and Smith, 1995). Experiments also provide a generalizable estimate of the treatment effect for the population when the sample size is large (MacLeod, 2017). Computing the results of an RCT is fairly straightforward. However, for statistical inference one needs to estimate the standard errors, which is more complicated (Deaton, 2010).
There are several alternatives for testing the significance of differences in means, but the workhorse for these computations is regression. As Table 3 indicates, most of the included studies used regression, and six of the seven RCT studies relied on it. Freedman (2008a) points out that it is common practice to adjust data from clinical trials using regression models and the like, which is confirmed by the observations in this study. The standard way of performing a regression on data from clinical trials is to regress the outcome variable on an assignment variable and a constant, often controlling for baseline covariates. Freedman (2008a) analysed this model in detail and concluded that it is nothing like a standard regression. He shows that the main issue is the dependence between the assignment variable and the error term, which violates key ordinary least squares (OLS) assumptions. This could bias the estimated treatment effect substantially in small samples. The bias tends to decrease as the number of participants increases, but it is possible that a regression without covariates may render superior results. It is difficult to identify which of the studies in our analysis use OLS regression, but there are clues pointing to two studies (Lewin et al., 2013a; Parsons et al., 2013). Freedman (2008b) also discussed the issues of using logit/probit regression on experimental data. His key finding is that randomization does not justify the assumptions underlying these models because the outcome value is deterministic given the assignment value. Under a logit model, by contrast, the outcome variable is interpreted as being random. Consequently, the usual maximum likelihood estimates could be inconsistent. The main problem here is not necessarily that these models have been used, as there are ways to solve the apparent problems, but rather the lack of discussion of their potential drawbacks. Freedman (2008a) (p. 12) states this issue quite sharply: "Practitioners will doubtlessly be heard to object that they know all this perfectly well. Perhaps, but then why do they so often fit models without discussing assumptions?"
There are also some non-technical problems with experiments, and these are more difficult to solve. Randomized experiments in social settings often rest on unstated assumptions, especially regarding the behavioural response of the participants, whose behaviour is often altered by the randomization itself (Heckman, 1991). Randomization bias, or deviations from assignment, cannot necessarily be treated as random measurement error and can therefore influence the results (Deaton, 2010). None of the RCT studies discussed these aspects. The RCT technology may constitute a powerful tool in applied situations when the underlying assumptions are met. Often, however, these assumptions are arguably no better than the assumptions found in non-experimental econometrics and statistics (Heckman, 1991). We would like to emphasise that the above discussion is based on a checklist designed by us, which has yet to be validated.
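To make the regression setup discussed above concrete, the following Python sketch fits the two specifications on simulated data: the outcome regressed on a constant and the assignment indicator, and the same model with one baseline covariate added. Everything in it is an assumption made for illustration (the sample size, the assumed effect of 3.0, the data-generating process and the helper `ols`); it is not the model of any reviewed study, and it does not reproduce Freedman's finite-population analysis. It only shows the mechanics of the two regressions.

```python
import numpy as np

def ols(y, X):
    """Ordinary least squares coefficients for y = X b + e."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)  # fixed seed; all data below are simulated
n = 40
baseline = rng.normal(50.0, 10.0, size=n)   # hypothetical baseline score
assign = np.zeros(n)
assign[rng.permutation(n)[: n // 2]] = 1.0  # randomize half to treatment
true_effect = 3.0                            # assumed, not from any study
outcome = (5.0 + 0.8 * baseline + true_effect * assign
           + rng.normal(0.0, 5.0, size=n))

const = np.ones(n)
# Model 1: outcome on a constant and the assignment indicator only.
b_unadj = ols(outcome, np.column_stack([const, assign]))
# Model 2: same, plus the baseline covariate.
b_adj = ols(outcome, np.column_stack([const, assign, baseline]))

diff_means = outcome[assign == 1].mean() - outcome[assign == 0].mean()
print(f"difference in means:      {diff_means:.3f}")
print(f"OLS without covariates:   {b_unadj[1]:.3f}")
print(f"OLS with baseline adjust: {b_adj[1]:.3f}")
```

The coefficient on the assignment indicator in the model without covariates coincides with the simple difference in means, whereas the covariate-adjusted coefficient generally differs from it. Which specification a study used, and why, is therefore information a reader needs in order to interpret a reported treatment effect.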
One of the objectives of this paper is to provide an overview of economic evaluations of HBR. Our review differs from earlier attempts (Tessier et al., 2016; Legg et al., 2016), especially in its "wider" inclusion criteria, with fewer limitations on study type and outcome measures. Three studies estimated the cost differences between HBR and usual care after the intervention, and all showed lower costs for HBR participants (Kjerstad and Tuntland, 2016; Lewin et al., 2013a; Lewin et al., 2014). For one of the studies, there were significant differences in mean cost in the AT analysis but not in the ITT analysis (Lewin et al., 2014). For the two other studies, either no significance was detected (Kjerstad and Tuntland, 2016) or significance was not reported (Lewin et al., 2013a). A rough estimate of the potential yearly homecare cost reduction per participant due to HBR lies between 800 and 1,700 €.
Scrutinizing columns five and six of Table 2, one finds similar inclusion and exclusion criteria defining the pool of participants in the various HBR studies. Most studies applied narrow inclusion criteria requiring that patients eligible for care be older than 65 years. The Norwegian studies are an exception, with the minimum age set to 18 years. There are, however, only small variations in the mean age of included participants (HBR: 76-82; usual care: 77-83). An additional trait common to the studies reviewed is the length of the HBR intervention itself, which was a maximum of 12 weeks. In the New Zealand version, in which participants were referred from the hospital, the length was limited to 8 weeks. In the most recent Norwegian study, the intervention length varied between 4 and 10 weeks (Langeland et al., 2019). Two studies (King et al., 2012; Parsons et al., 2013) were not informative with respect to this aspect. According to our observations, the length and amount of HBR administered were hardly ever explicitly explained. Studies examining potential health benefits from HBR do not use one standardized instrument, which makes direct comparison of the results difficult. We will therefore focus the discussion on whether there were common trends in terms of statistical significance for potential health benefits.
Physical functioning or independence was the potential benefit category covered by the most studies. Lewin and Vandermeulen (2010) reported that the HBR group scored significantly better on all physical measures at the 3- and 12-month follow-ups. These results are consistent with an earlier study examining short-term effects (Tinetti et al., 2002) and a more recent study (Parsons et al., 2013). In contrast, three studies showed no statistically significant difference in either functional mobility or ADL (King et al., 2012; Tuntland et al., 2015; Senior et al., 2014). These studies all included physical gain as a secondary outcome and had longer follow-up periods, between 7 and 18 months. The exception to this pattern was the study by Langeland et al. (2019), which found a significant effect on the secondary outcome of physical functioning. There is no clear evidence supporting the notion that HBR significantly increases physical functioning. However, HBR tended to lead to superior results on the selected instruments.
Four studies in our review relied on HRQoL or QoL to measure increased health benefits (King et al., 2012; Langeland et al., 2019; Lewin et al., 2013b; Tuntland et al., 2015). However, only one study had change in HRQoL as the primary outcome (King et al., 2012). This study showed a promising result, as the mental health component of the SF-36 was the main driver of the increased score for the HBR group. The three remaining studies measuring HRQoL or QoL reported insignificant differences between HBR and usual care at the final follow-up, which varied between 9 and 12 months. To summarize, there is no convincing long-term evidence that HBR increases HRQoL or QoL. Regarding other self-perceived health benefits, the results are likewise inconclusive. Two studies (King et al., 2012; Parsons et al., 2013) reported no significant difference in social support measured with the Duke Social Support Index (DSSI) (Koenig et al., 1993). Regression results from assessing the psychological well-being of older people also showed no significant difference at the 12-month follow-up (Lewin and Vandermeulen, 2010). Self-perceived activity performance and satisfaction with that performance were analysed in two studies (Tuntland et al., 2015; Langeland et al., 2019). Both the performance and satisfaction measures were significantly better for the HBR group at the 6- or 9-month follow-up. However, the treatment effect was not clinically relevant.
In an unadjusted analysis, it was demonstrated that HBR users were significantly less often assessed and approved for a higher level of care over a 2-year period (Lewin et al., 2014). Senior et al. (2014) observed that HBR reduced the probability of death or permanent residential care, but their observations lacked statistical significance. It was also shown that HBR users were less than half as likely to have an ED visit during the home care episode (Tinetti et al., 2002). Over a 2-year period, HBR recipients had significantly fewer ED presentations than individuals receiving the baseline treatment, though these results only hold for the AT analyses and were unadjusted (Lewin et al., 2014). The latter findings also hold for the number of hospital admissions. Moreover, an earlier study concluded that HBR participants were less likely to be readmitted to the hospital than subjects under usual care, a result that was only significant at the 10% level (Tinetti et al., 2012). In addition, HBR shows some promising results with respect to reducing the need for specialist or residential care. In the first study included in this review, it was shown that HBR participants were significantly more likely to remain at home after a homecare episode (Tinetti et al., 2002). This effect seems to hold over a 12-month period (Lewin et al., 2013b). There is evidence that, relative to usual care, HBR significantly reduces the number of homecare hours and visits as well as the general duration of homecare episodes in the long term (Kjerstad and Tuntland, 2016; Lewin et al., 2014; Tinetti et al., 2002; Tinetti et al., 2012).

Conclusion
This review summarizes and assesses the currently available literature on empirical evaluations of HBR. In short, the existing evidence regarding the effects of HBR is still inconclusive. The results are inconsistent, as some studies report a significant positive effect of HBR versus usual care, while others fail to establish such an effect. However, so far it has not been established that HBR has negative effects. In addition, this review provides a critical, constructive assessment of the associated publication process. We understand that HBR is a complex intervention implemented in an equally complex setting. Out of this understanding grows the utmost respect for all current research efforts aimed at estimating the effects of HBR. The research reviewed provides a basis to build on. With complex interventions in social settings, there might also be a need for a variety of analytical perspectives to capture this complexity. To ensure successful future research efforts, the multidisciplinary nature of HBR needs to be reflected in the diversity of the research teams taking on the challenge.