Bullet_General.gif (3885 bytes) Least-Squares curve fitting with statistical analysis by using an arbitrary order polynomial as approximating function. (線上執行) (原始碼下載)

上次已經介紹過 Least-Squares curve fitting by using an arbitrary order polynomial as approximating function.,但是並沒有包含統計分析,因此無法判斷該用幾階多項式當作近似函數,這次提供的 LS 副程式除了可自動求得變異數(variance)最小的多項式之外,亦計算三個額外的數據如下:

  • 總誤差 = VB_Numerical-13.gif (302 bytes):其中 yi 是已知數據點, Yi 是近似函數所求得之數據點,m是數據點的總數。
  • 變異數 (variance) = 總誤差 / (m - N - 1):m是數據點的總數,N是多項式之階數或羃次(degree)。
  • 相關係數(correlation factor)的平方 (R-squared):令 S=VB_Numerical-13-2.gif (294 bytes)yi 是已知數據點,VB_Numerical-13-3.gif (79 bytes)yi 總合的平均值,m是數據點的總數,則 R-squared = (S - 總誤差) / S。

 

副程式

Public Sub LS(MinD As Long, MaxD As Long, x() As Single, f() As Single, _
                  ReturnCoeff() As Single, ReturnSta() As Single)
     Dim d As Long, i As Long, j As Long, no As Long, m() As Single, Tfx As Single
     Dim row As Long, col As Long, n() As Single, fx As Single, St As Single
     Dim MinVar As Single, ExactD As Long, coeff() As Single, sta() As Single
     ReDim coeff(MinD To MaxD, 1 To MaxD + 1), sta(MinD To MaxD, 1 To 3), ReturnSta(1 To 3)
     no = UBound(x)
     If MinD > MaxD Then MsgBox "羃次下限必須不大於羃次上限": Exit Sub
     If no <> UBound(f) Then MsgBox "xf(x)的數目不吻合": Exit Sub
     If no <= MaxD + 1 Then MsgBox "資料點數目必須大於羃次+1": Exit Sub
     For d = MinD To MaxD
          row = d + 1: col = d + 2: St = 0: fx = 0: Tfx = 0
          ReDim m(1 To row, 1 To col), n(0 To 2 * d)
          For i = 0 To 2 * d
               For j = 1 To no
                    n(i) = n(i) + x(j) ^ i
               Next
          Next
          For i = 1 To row
               For j = 1 To row
                    m(i, j) = n(i + j - 2)
               Next
          Next
          For i = 1 To row
               For j = 1 To no
                    m(i, col) = m(i, col) + x(j) ^ (i - 1) * f(j)
               Next
          Next
          Call GEMP(m, ReturnCoeff)
          For i = 1 To no
               fx = ReturnCoeff(d + 1)
               For j = d To 1 Step -1
                    fx = ReturnCoeff(j) + fx * x(i)
               Next
               Tfx = Tfx + f(i)
               sta(d, 1) = sta(d, 1) + (f(i) - fx) ^ 2
          Next
          Tfx = Tfx / no
          For i = 1 To no
               St = St + (f(i) - Tfx) ^ 2
          Next
          sta(d, 2) = sta(d, 1) / (no - d - 1)
          sta(d, 3) = (St - sta(d, 1)) / St
          For i = 1 To d + 1
               coeff(d, i) = ReturnCoeff(i)
          Next
          If d = MinD Then MinVar = sta(d, 2) + 1
          If sta(d, 2) < MinVar Then MinVar = sta(d, 2): ExactD = d
     Next d
     ReDim ReturnCoeff(1 To ExactD + 1)
     For i = 1 To ExactD + 1
          ReturnCoeff(i) = coeff(ExactD, i)
     Next
     For i = 1 To 3
          ReturnSta(i) = sta(ExactD, i)
     Next
End Sub

其中 GEMP 這個副程式,請參閱本站的 Solve a set of simultaneous linear equations by Gauss Elimination with maximization of pivot elements.

 

參數說明:

  • MinD、MaxD:MinD 代表欲 Fit 多項式的羃次下限,MaxD 代表欲 Fit 多項式的羃次上限,LS 副程式將在這個範圍內求得變異數最小的多項式。
  • x():自變數之值,陣列基底為 1。
  • f():應變數之值,陣列基底為 1。
  • ReturnCoeff():傳回多項式之係數,陣列基底為 1。例如:ReturnCoeff(1)代表常數項之係數,ReturnCoeff(2)代表 1 次方之係數,依此類推。
  • ReturnSta():是一有三個元素的一維陣列,ReturnSta(1) 傳回總誤差,ReturnSta(2) 傳回變異數,ReturnSta(3) 傳回 R-squared。

 

範例:

試利用最小平方近似法,求出以下數據之最適多項式近似函數。(比較1~4階多項式之變異數,變異數最小者為最適多項式)

x 0.05 0.11 0.15 0.31 0.46 0.52 0.70 0.74 0.82 0.98 1.17
f(x) 0.956 0.890 0.832 0.717 0.571 0.539 0.378 0.370 0.306 0.242 0.104
Private Sub Command1_Click()
     Dim i As Long, keyin As String, A() As Single, f() As Single
     Dim x() As Single, MinD As Long, MaxD As Long, sta() As Single
     keyin = InputBox("請輸入自變數x值,以空白來區隔。" & vbNewLine & "例如:1 2 3", "x")
     If Not SepStrToNumArray(keyin, " ", 1, x) Then GoTo WrongValue
     keyin = InputBox("請輸入應變數f(x)值,以空白來區隔。" & vbNewLine & "例如:10 20 30", "f(x)")
     If Not SepStrToNumArray(keyin, " ", 1, f) Then GoTo WrongValue
     keyin = InputBox("請輸入欲 Fit 多項式的羃次(degree)下限", "多項式的羃次下限")
     If Not IsNumeric(keyin) Or (keyin Like "*.*") Or (keyin Like "*-*") Then GoTo WrongValue
     MinD = CLng(keyin)
     keyin = InputBox("請輸入欲 Fit 多項式的羃次(degree)上限", "多項式的羃次上限")
     MaxD = CLng(keyin)
     Call LS(MinD, MaxD, x, f, A, sta)
     For i = 1 To UBound(A)
          Debug.Print A(i)
     Next
     Debug.Print
     For i = 1 To 3
          Debug.Print sta(i)
     Next
     Exit Sub
WrongValue:
     MsgBox "輸入了無效的數值"
End Sub

其中 SepStrToNumArray 這個副程式,請參閱本站的 將字串拆解成陣列

MinD 傳入 1,MaxD 傳入 4,執行以上的程式將可在VB的即時運算視窗顯示以下結果:

0.9979691 
-1.018045 
 0.2246844 

 1.867514E-03 
 2.334393E-04 
 0.9976708 

由此可知 2 階多項式為最適多項式:0.9979691 - 1.018045x + 0.2246844x2
其總誤差 = 1.867514E-03,變異數 = 2.334393E-04,R-squared = 0.9976708

如果不想求得最適多項式,只想計算特定階數的多項式,則只要 MinD=MaxD 即可。

x 1.4 2.5 3.1 3.9 5.0 7.1 9.5 11.9 14.1 15.0 16.5 17.2
f(x) 3.8 2.6 4.1 5.2 6.2 6.9 7.2 6.7 5.8 3.8 3.8 2.8

例如以上數據想以 2 階多項式來 fit,則執行 Command1,羃次下限和羃次上限皆輸入 2 ,即可在VB的即時運算視窗顯示以下結果:

1.130045 
 1.284787 
-6.935126E-02 

 4.294727 
 0.4771919 
 0.85236 

3/31/1999 補充:

如果數據點很多的時候,使用 LS 副程式通常會發生很嚴重的捨入誤差 (roundoff error),解決辦法是將整個專案的 Single 變數改宣告成 Double,請參考以下的範例:

以下的 x 和 f(x) 是擷取某個實驗的數據,並以 6 階多項式來 fit。

Private Sub Command2_Click()
     Dim i As Long, keyin As String, A() As Single, f() As Single, x() As Single, sta() As Single
     keyin = "0 0.005 0.01 0.015 0.02 0.025 0.03 0.035 0.04 0.045 0.05 0.055 0.06 0.065 0.07 0.075 " & _
     " 0.08 0.085 0.09 0.095 0.1 0.105 0.11 0.115 0.12 0.125 0.13 0.135 0.14 0.145 0.15 0.155 0.16 " & _
     " 0.165 0.17 0.175 0.18 0.185 0.19 0.195 0.2  0.205 0.21 0.215 0.22 0.23 0.24 0.25 0.26 0.27 0.28 " & _
     " 0.29 0.3 0.31 0.32 0.34 0.36 0.38 0.4 0.42 0.44 0.46 0.48 0.5"
     Call SepStrToNumArray(keyin, " ", 1, x)
     keyin = "0.71 0.732 0.748 0.761 0.772 0.78 0.787 0.793 0.798 0.799 0.797 0.792 0.786 0.78 0.776 " & _
     " 0.765 0.759 0.752 0.743 0.736 0.723 0.716 0.703 0.696 0.683 0.675 0.664 0.657 0.648 0.639 0.633 " & _
     " 0.623 0.618 0.613 0.607 0.6 0.598 0.593 0.589 0.582 0.579 0.577 0.575 0.574 0.573 0.57 0.568 0.567 " & _
     " 0.567 0.567 0.57 0.571 0.573 0.576 0.579 0.582 0.583 0.584 0.588 0.591 0.593 0.5955 0.597 0.597"
     Call SepStrToNumArray(keyin, " ", 1, f)
     Call LS(6, 6, x, f, A, sta)
     For i = 1 To UBound(A)
          Debug.Print A(i)
     Next
     Debug.Print
     For i = 1 To 3
          Debug.Print sta(i)
     Next
End Sub

其中 SepStrToNumArray 這個副程式,請參閱本站的 將字串拆解成陣列

執行 Command2 之後,即可在 VB 的即時運算視窗顯示以下結果:

0.9745973 
-19.22277 
 410.0353 
-3540.429 
 13968.49 
-25375.8 
 17242.27 

 0.5052314 
 8.863708E-03 
-0.1236065 

由以上的數據可知,相關係數 (correlation factor) 的平方值已經小到 -0.1236065,這是非常怪異的現象,除非原始數據非常 noise,不然不太可能會發生這種情形。接著我們把整個專案的 Single 變數改宣告成 Double (用 VB 的 "取代" 功能),其他數據維持不變,再執行 Command2 一次,即可在 VB 的即時運算視窗顯示以下結果:

0.706363278353505 
 4.97858806199404 
-85.8230376299251 
 493.616610321853 
-1339.07277833309 
 1759.51765831704 
-904.758357221887 

 2.0610905854279E-04 
 3.61594839548754E-06 
 0.999541624911717 

不要懷疑自己的眼睛,這次執行的結果和上一次差異非常大,並不是程式寫錯,而是因為宣告成 Single 變數會造成比宣告成 Double 變數還要嚴重很多的捨入誤差,因此第二次的執行結果才是正確的,也許有網友會問說:如何知道第二次的執行結果正確?因為我使用其他的數值軟體(MATLAB)來作,也是那樣的結果,所以應該是沒問題的。

捨入誤差是在利用電腦做數值運算時一定會遇到的問題,本站很多的程式都是預設宣告成Single,在這建議網友在真正使用時最好改宣告成 Double,雖然這樣會佔用較多的記憶體以及較長的執行時間 (通常感覺不出來),但是為了獲得更正確的結果,我想這麼作是值得的。如果需要比宣告成 Double 更小的捨入誤差,可以使用更精確的 Decimal 資料型態,但是一般使用 Double 應該就足夠了。


[ 上一個 | 首頁 | 數值分析 | 下一個 ]

This page was written by Jaric on Nov. 21, 1998. All rights reserved.

Revised : Mar. 31 , 1999 .

Total pageview since 4/6/1999.