Friday, April 27, 2012

Your comments are appreciated... I will invite you as a coauthor

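4/27: Added the Discussion.
4/19: Added the bootstrap data. Thanks to 業太 for his highly efficient analysis (he has been invited as a coauthor).
4/17: The Method section is largely written; see below.
As I said before: I sincerely invite everyone to offer suggestions, and I will invite anyone who offers critical comments to be a coauthor.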

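My next main research topic is the long-delayed one of individual-level responsiveness.
Dr. Jeremy C Hobart (JNNP 2010; 81:1044-1048) proposed taking the SE that an IRT analysis yields for each individual case and checking whether that case's improvement exceeds 1.96*SE; if so, the case has achieved a statistically significant improvement. This method can therefore be used to compare the individual-level responsiveness of different measures.

Previous studies, including some of my own publications, have shown that short-form assessment tools have group-level responsiveness similar to that of the original versions (usually verified with effect sizes).

I could use the IRT method proposed by Hobart to examine the individual-level responsiveness of my earlier short-form assessment tools (e.g., short form PASS, short form BBS), but that would still require an IRT analysis, which is somewhat troublesome.

In addition, just today I received the acceptance letter for the manuscript on the MDC of the short form PASS (it was submitted to 3 journals and took about 2 years).

So it occurred to me to replace 1.96 SE with the MDC, which makes IRT unnecessary.

I found a data set; the analysis results and abstract are as follows: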

Comparison of individual-level responsiveness between the original and short-form Postural Assessment Scale for Stroke Patients 
Background and purpose: We have previously shown that the group-level responsiveness of the Postural Assessment Scale for Stroke Patients (PASS) and the short form PASS (SFPASS) is similar. This result is counterintuitive because the PASS has more items (12) and response options (4) than the SFPASS (5 and 3, respectively). However, the individual-level responsiveness of the two measures remains unknown, which makes it difficult for clinicians and researchers to choose between these competing measures. Thus, the purpose of this study was to compare the individual-level responsiveness of the PASS and the SFPASS.
Method: A total of 179 stroke patients were assessed using the PASS at 14 days and 30 days after onset. The SFPASS scores were calculated from the patients’ responses on the PASS. We calculated individual-level responsiveness on the basis of the minimal detectable change (MDC): if a patient’s change score was beyond the MDC of the PASS or SFPASS, his/her improvement was considered statistically significant. The MDCs of the PASS and SFPASS were obtained from previous studies. We compared the number of patients whose change between 14 days and 30 days after onset was beyond the MDC of the PASS with the number whose change was beyond the MDC of the SFPASS.
Results: We found that 47.5% of the patients changed beyond the MDC of the PASS and that 36.1% of the patients changed beyond the MDC of the SFPASS. The difference was significant (P < 0.001).
Conclusion: The PASS had better individual-level responsiveness than the SFPASS and is recommended for clinical trials. To report the effects of clinical trials comprehensively, future studies using the PASS should report the individual-level effect (number of patients scoring beyond the MDC) in addition to the group-level effect (e.g., effect size).


Introduction:

        A short and psychometrically sound measure gives clinicians and researchers an efficient way to quantify patients’ outcomes. Recent studies have shown that short-form measures have psychometric properties, particularly responsiveness (a critical index of outcome measures), similar to those of the original or long-form measures.1-4 However, Hobart et al argue that this similarity in responsiveness between short forms and long forms might result from group-level comparison.5 They used the standard error generated for each participant by item response theory (IRT) to calculate individual-level responsiveness. Their results show that a short measure (the 10-item, 2-4-response-option Barthel index) has lower individual-level responsiveness than a long measure (the 13-item, 7-response-option Functional Independence Measure).5 Thus, individual-level responsiveness is critical for clinicians and researchers selecting among competing measures.
        Although the IRT-based standard error is useful for calculating individual-level responsiveness, most measures have been developed and examined using classical test theory (CTT), primarily because of its simplicity. CTT can also generate a similar index of random error (called the minimal detectable change, MDC, or smallest real difference). The MDC is the smallest change score that is beyond random error at a certain level of confidence (usually 95%).6, 7 Thus, the MDC can be used as the safest threshold for identifying statistically significant individual changes.6 The MDC is therefore a simple and useful basis for estimating the individual-level responsiveness of a measure.
        We have shown that the group-level responsiveness of the Postural Assessment Scale for Stroke Patients (PASS) and the short form PASS (SFPASS) is similar.1 This result is counterintuitive because the PASS has more items (12) and response options (4) than the SFPASS (5 items and 3 response options). However, the individual-level responsiveness of the two measures remains unknown, which makes it difficult for clinicians and researchers to choose between these competing measures. Thus, the purpose of this study was to compare the individual-level responsiveness of the PASS and the SFPASS.


Method
Data were drawn from a longitudinal follow-up study. Each subject in that study was assessed at 14 days after stroke onset and reassessed at other specific time points (e.g., 30 days after onset) to characterize balance ability (e.g., as measured by the PASS) and recovery from neurological impairments. Subjects met the following criteria: (a) first or recurrent cerebrovascular accident without other major diseases (e.g., cancer, dementia, severe rheumatoid arthritis); (b) ability to follow verbal instructions to complete the PASS; and (c) ability to provide informed consent personally or by proxy. Subjects were excluded if they had another stroke or developed another major disease during the follow-up period. We also excluded patients who obtained the highest possible PASS score (i.e., 36) because these patients had no room to improve on the PASS or SFPASS. The study was approved by the institutional review board of a university hospital.
Procedure
The PASS was administered by an occupational therapist who was not informed of the purpose of this study. The patients were assessed in the hospital or at home. The SFPASS scores were derived from the PASS responses.

Measures
The PASS was developed specifically to assess balance function in people with stroke.4 It contains 12 four-level (0-1-2-3) items assessing a person’s balance performance in situations of varying difficulty, i.e., maintaining or changing a lying, sitting, or standing position. The total score ranges from 0 to 36, and the psychometric properties of the PASS have been found satisfactory in people with stroke. The MDC of the PASS was 3.2, which was estimated on the basis of 52 patients with chronic stroke.
The SFPASS has 5 three-level items, which are listed in the Appendix. The 5 items were selected from the original PASS as those with the best measurement properties (i.e., higher internal consistency and greater responsiveness). The middle level of the SFPASS was created by combining the middle two levels (1 and 2) of the original PASS. Thus, both the items and the scores of the SFPASS can be obtained from the scores of the PASS. The SFPASS score ranges from 0 to 15. The psychometric properties of the SFPASS (including reliability, validity, and group-level responsiveness) were very similar to those of the original PASS. The MDC of the SFPASS was 2.2, which was estimated in the same patients used to estimate the MDC of the PASS.
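The derivation of an SFPASS score from PASS item responses might be sketched as follows. This is only an illustration under stated assumptions, not the published scoring rule: the item positions in `SFPASS_ITEMS` are placeholders (the actual five items are those listed in the Appendix), and the 0 / 1.5 / 3 item scoring is an assumption inferred from the stated 0 to 15 range for five three-level items.

```python
# Hypothetical 0-based positions of the five SFPASS items among the 12 PASS items.
SFPASS_ITEMS = (0, 2, 4, 6, 8)

# Collapse PASS levels 1 and 2 into a single middle level.
# The 0 / 1.5 / 3 values are an assumption chosen so the total spans 0 to 15.
COLLAPSE = {0: 0.0, 1: 1.5, 2: 1.5, 3: 3.0}


def sfpass_score(pass_items):
    """Return an SFPASS total derived from 12 PASS item scores (each 0 to 3)."""
    if len(pass_items) != 12:
        raise ValueError("expected 12 PASS item scores")
    return sum(COLLAPSE[pass_items[i]] for i in SFPASS_ITEMS)
```

Because levels 1 and 2 map to the same value, a patient who moves from 1 to 2 on a PASS item shows no change on the corresponding SFPASS item.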


Data analysis

Group level comparison
Three indicators were used to examine group-level responsiveness. First, Kazis’ effect size was calculated by dividing the mean change by the standard deviation of the baseline scores obtained at 14 days after stroke onset. Second, the standardized response mean (SRM), another type of effect size, was calculated by dividing the mean change by the standard deviation of the change scores. According to Cohen’s criteria, an effect size greater than 0.8 is large, 0.5 to 0.8 is moderate, and 0.2 to 0.5 is small. Third, a paired t-test was performed to examine the statistical significance of the change in scores from 14 days to 30 days after onset.
In addition, to compare the responsiveness of the PASS and the SFPASS, we estimated the 95% confidence intervals of Kazis’ effect size and the SRM from 10,000 bootstrap samples and used them to test the differences between the two measures.
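The two effect size indices and their bootstrap intervals could be computed along the following lines. This is a minimal sketch using only the Python standard library; the function names are illustrative, and the percentile interval is an assumption, since the text does not say which type of bootstrap confidence interval was used.

```python
import random
import statistics


def kazis_effect_size(baseline, follow_up):
    """Mean change divided by the SD of the baseline scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return statistics.mean(changes) / statistics.stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the SD of the change scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return statistics.mean(changes) / statistics.stdev(changes)


def bootstrap_ci(baseline, follow_up, statistic, n_boot=10_000, seed=0):
    """Percentile 95% CI from resampling patients with replacement."""
    rng = random.Random(seed)
    n = len(baseline)
    estimates = []
    for _ in range(n_boot):
        # Resample whole patients so each baseline stays paired with its follow-up.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(statistic([baseline[i] for i in idx],
                                   [follow_up[i] for i in idx]))
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]
```

Calling `bootstrap_ci(day14, day30, standardized_response_mean)` once with PASS scores and once with SFPASS scores (hypothetical variable names) gives the two intervals whose overlap is then inspected.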


Individual person level comparison
The MDCs of the PASS and the SFPASS (3.16 and 2.16, respectively) were retrieved from a recent study. They were estimated in a test-retest reliability investigation in which 52 patients in stable condition were tested twice, one week apart. The MDC, which is based on the standard error of measurement (SEM), was calculated with the following formulas:
                  MDC = z-score * √2 * SEM                                      (1)
                  SEM = √(error variance)                                       (2)
In formula (1), the z-score corresponds to the chosen confidence level of the standard normal distribution (1.96, for 95% confidence, was used in this study). In formula (2), the SEM is the square root of the error variance, including systematic differences, which can be obtained from the ANOVA table (de Vet et al., 2006a).
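As a small worked illustration of formulas (1) and (2), the sketch below computes an MDC from an SEM and also inverts formula (1) to recover the SEM implied by a reported MDC. The function names are illustrative, and the implied SEM values are derived here from the reported MDCs rather than taken from the original reliability study.

```python
import math


def sem_from_error_variance(error_variance):
    """Formula (2): SEM is the square root of the error variance."""
    return math.sqrt(error_variance)


def mdc(sem, z=1.96):
    """Formula (1): MDC = z * sqrt(2) * SEM."""
    return z * math.sqrt(2) * sem


def sem_from_mdc(mdc_value, z=1.96):
    """Invert formula (1) to get the SEM implied by a reported MDC."""
    return mdc_value / (z * math.sqrt(2))
```

With z = 1.96, an MDC of 3.16 implies an SEM of about 1.14 points on the PASS, and an MDC of 2.16 implies an SEM of about 0.78 points on the SFPASS.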
            The relative responsiveness of the PASS and SFPASS was compared at the individual person level. First, we calculated each patient’s change score (“score at 30 days after onset” - “score at 14 days after onset”). Second, we determined whether the change score reached the MDC. Finally, we assigned each patient to one of three groups according to the size and direction of the change score. The first group was significant improvement: change score ≥ MDC. The second group was non-significant improvement: 0 < change score < MDC. The third group was others (no change or worsening): change score ≤ 0.
            Finally, we counted the number of patients in each group. The distributions for the two balance measures were compared using a χ² test and relative risk statistics.
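The classification and comparison steps above might look like the following sketch. It assumes the three groups as defined in the text, with a change of exactly 0 counted in the third group; the function names are illustrative, and the closed-form P value applies only to a 2 x 3 table (2 degrees of freedom).

```python
import math
from collections import Counter

GROUPS = ("significant improvement", "non-significant improvement", "other")


def classify_change(change, mdc):
    """Assign one change score to one of the three groups."""
    if change >= mdc:
        return GROUPS[0]
    if change > 0:
        return GROUPS[1]
    return GROUPS[2]  # no change or worsening


def group_counts(changes, mdc):
    """Count patients per group, in the order given by GROUPS."""
    counts = Counter(classify_change(c, mdc) for c in changes)
    return [counts[g] for g in GROUPS]


def chi_square_2x3(row_a, row_b):
    """Pearson chi-square statistic and P value for a 2 x 3 table of counts."""
    table = [row_a, row_b]
    row_totals = [sum(r) for r in table]
    col_totals = [a + b for a, b in zip(row_a, row_b)]
    total = sum(row_totals)
    stat = 0.0
    for r, row in enumerate(table):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / total
            if expected == 0:
                raise ValueError("empty group; chi-square is undefined")
            stat += (observed - expected) ** 2 / expected
    # With 2 degrees of freedom the chi-square survival function is exp(-x/2).
    return stat, math.exp(-stat / 2)


def relative_risk(count_a, n_a, count_b, n_b):
    """Ratio of the proportions of significantly improved patients."""
    return (count_a / n_a) / (count_b / n_b)
```

Applied to the proportions reported in the abstract (47.5% for the PASS and 36.1% for the SFPASS), the relative risk would be about 1.32.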

Results
            A total of 301 patients were assessed using the PASS at 14 days after a recent stroke onset. Forty-one patients were not followed because they had achieved the highest possible PASS score, were in unstable condition, or were discharged without notice. The remaining 260 patients were assessed at both time points, and their data were used for further analyses. These patients had a wide range of balance impairment (from bedridden to nearly able to stand on the affected leg for 10 seconds).


Group level comparison of both measures
            Kazis’ effect size and the SRM showed that both measures had small (near-moderate) to large responsiveness (0.46 ~ 0.91) in detecting changes from 14 days to 30 days after stroke (Table 3 [tentative]). In particular, the 95% CIs of the two effect size indices for the PASS and SFPASS largely overlapped each other. The changes on both measures were significant (P < 0.001).


Table 3. Group level responsiveness of both balance measures (n=251)
Measure   Kazis’ effect size (95% CI)*   Standardized response mean (95% CI)*   Paired t (P)
PASS      0.46 (0.39~0.53)               0.91 (0.82~1.01)                       14.4 (<0.001)
SFPASS    0.48 (0.39~0.56)               0.83 (0.71~0.91)                       13.1 (<0.001)
* Estimated by 10,000 bootstrap samples.



Discussion
The purpose of this study was to determine whether the SFPASS detects change as well as the PASS does. Because the PASS has more items and response categories, it would be expected to have greater potential to detect change than the SFPASS. However, we found that the responsiveness of the PASS and SFPASS was similar at the group level, as shown by Kazis’ effect size and the SRM. These results were confirmed by 10,000 bootstrap samples. Importantly, the PASS had better individual-level responsiveness than the SFPASS: it detected significant recovery of balance function in more patients. Thus, the PASS showed better ability to detect change than the SFPASS.
Our findings also imply that studies using the PASS or SFPASS as an outcome measure should report the individual-level effect (i.e., number of patients scoring beyond the MDC) in addition to the group-level effect (e.g., effect size) in order to report the effects of clinical trials comprehensively. Reporting the individual-level effect would also help clinicians interpret their clinical observations on the basis of objective measurement properties (i.e., the change observed in each patient has to be beyond random measurement error [e.g., the MDC]).
Our study raises two issues for researchers examining the responsiveness of an outcome measure. First, we strongly recommend examining individual-level responsiveness, particularly when comparing competing measures (e.g., short forms vs long forms). The results will be critical for clinicians and researchers choosing among competing measures on the basis of comprehensive empirical evidence.
Second, our findings support the argument that group-level indicators of responsiveness (e.g., Kazis’ effect size and the SRM) are inappropriate or limited.5, 16 Group-level indicators could not demonstrate a difference in responsiveness between a short measure (the Barthel index) and a long measure (the Functional Independence Measure).5, 16, 17 However, the superiority of the Functional Independence Measure over the Barthel index in detecting change was demonstrated by individual-level analyses.5 These observations indicate that group-based indices of responsiveness can be misleading.
This study has two limitations. First, our patients were followed at the subacute stage. As expected, only a few patients deteriorated during the follow-up period. Thus, the relative ability of the two measures to detect deterioration remains unknown. Second, the SFPASS scores were derived from those of the PASS. Future studies may need to validate our findings by administering the PASS and SFPASS independently.
In brief, the PASS showed better individual-level responsiveness than the SFPASS and is recommended for clinical trials and clinical settings. Future studies using the PASS should report the individual-level effect (i.e., number of patients scoring beyond the MDC) in addition to the group-level effect (e.g., effect size) in order to report the effects of clinical trials comprehensively.



I sincerely invite everyone to offer suggestions, and I will invite anyone who offers critical comments to be a coauthor.





13 comments:

  1. Could we also compare the proportions of cases exceeding the MID for the PASS and the SFPASS?

  2. It's a good idea. However, the PASS's MID has not been estimated yet.

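  3. My intuition is that responsiveness should not be affected by the number of items, but the scale of the item scores might affect it (the wider the scale, the more finely it can differentiate respondents' abilities). This is only my view; I welcome your corrections, Professor.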

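  4. to En-Chi: Then what is the function of having more items, or of adding items?
    Note: We do not mind "subjectivity" or "intuition"; what we weigh is "theoretical basis" and "empirical evidence".
    In other words, you need to provide a "basis".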

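  5. The PASS seems to detect about 10% more improved cases than the SFPASS, but how much longer does the PASS take to administer than the SFPASS? Which one would you recommend to clinicians? In clinical trials, using the PASS should be better for obtaining more complete and accurate results on treatment effects, but in time-limited clinical practice, should clinicians decide for themselves?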

  6. User centered, of course!
    It also depends, though, on the user's judgment and on what the user prioritizes in the trade-off.

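  7. I have 2 questions:
    1. Do both the number of items and the number of response levels affect responsiveness?
    If so, could using the short-form items while keeping the original number of response levels improve individual responsiveness?
    2. Presenting the results of a single article in the first paragraph of the Introduction seems to overemphasize that study. Are there other studies whose results support or contradict it?

    I also have 2 suggestions:
    1. Could the Introduction briefly introduce the PASS? The PASS is not yet widely used in clinical practice.
    2. Perhaps a separate study could explore how clinicians prioritize when choosing assessment tools (e.g., accuracy, speed, low cost, easy availability), as a reference for promoting measures in the future.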

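  8. Q1. I think so.
    Q2. Which one does "this article" refer to? Have you searched the literature? That would make your comment more specific and critical.
    Suggestion 1. I will consider it, because it is not necessarily a key point and it would affect coherence.
    Suggestion 2. Yes, and that goes for you too. You could list the topic you raised as material for your own future writing.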

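    Replies:
    1. Regarding Q2.
      Sorry, I did not state the question clearly.
      "This article" refers to the article by Hobart and colleagues.
      Before responding to this post, I ran a simple Medline search with (responsiveness AND (group AND individual)).ab. for related literature, but found no articles comparing group-level and individual-level responsiveness across different measures.
      I raised the question because I was not sure whether I had missed important literature.

      Regarding Suggestion 2.
      OK. Thank you for your suggestion.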

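  9. Q2: As far as I know, the MDC can be used to examine individual-level change, but it seems that no study has actually applied it. So other literature should be hard to find.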

  10. In other words, Hobart may be the first in the world to propose this concept!

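  11. One small comment... Discussion, second paragraph, fourth line: "Such information would help clinicians..." Reading on from the preceding sentence, it is not quite clear what "such information" refers to ("individual-level effect"?).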
