  1. Description of the German credit dataset.
  2.  
  3. 1. Title: German Credit data
  4.  
  5. 2. Source Information
  6.  
  7. Professor Dr. Hans Hofmann
  8. Institut für Statistik und Ökonometrie
  9. Universität Hamburg
  10. FB Wirtschaftswissenschaften
  11. Von-Melle-Park 5
  12. 2000 Hamburg 13
  13.  
  14. 3. Number of Instances: 1000
  15.  
  16. Two datasets are provided. The original dataset, in the form provided
  17. by Prof. Hofmann, contains categorical/symbolic attributes and
  18. is in the file "german.data".
  19.  
  20. For algorithms that need numerical attributes, Strathclyde University
  21. produced the file "german.data-numeric". This file has been edited
  22. and several indicator variables added to make it suitable for
  23. algorithms which cannot cope with categorical variables. Several
  24. attributes that are ordered categorical (such as attribute 17) have
  25. been coded as integer. This was the form used by StatLog.
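
As a rough illustration of the two conversions described above, the sketch below turns an unordered categorical attribute into indicator (0/1) variables and codes an ordered one as an integer. The exact encoding used to produce "german.data-numeric" is not given in this description, so the function names and the category order assumed for attribute 17 are illustrative assumptions only.

```python
# Hypothetical helpers; not the actual procedure behind "german.data-numeric".

def indicator_variables(value, categories):
    """Return one 0/1 indicator per category (one-hot encoding)."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]


def ordinal_code(value, ordered_categories):
    """Code an ordered categorical value as its 1-based rank."""
    return ordered_categories.index(value) + 1


# Attribute 15 (housing) has no natural order, so it becomes indicators.
HOUSING = ["A151", "A152", "A153"]
# Attribute 17 (job), assumed here to be ordered from A171 up to A174.
JOB = ["A171", "A172", "A173", "A174"]

housing_encoded = indicator_variables("A152", HOUSING)  # [0, 1, 0]
job_encoded = ordinal_code("A173", JOB)                 # 3
```

Indicator variables avoid implying an order between categories, while the integer rank keeps the order for attributes that have one.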
  26.  
  27.  
  28. 6. Number of Attributes german: 20 (7 numerical, 13 categorical)
  29. Number of Attributes german.numer: 24 (24 numerical)
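
The 7 numerical / 13 categorical split can be checked against a parsed row. The sketch below assumes that each row of "german.data" is whitespace-separated and holds the 20 attributes followed by a class label; that layout is not stated in this description, and the sample row is illustrative rather than quoted from the file. `parse_row` is a hypothetical helper name.

```python
# Attributes marked "(numerical)" in the attribute description.
NUMERICAL = {2, 5, 8, 11, 13, 16, 18}
N_ATTRIBUTES = 20


def parse_row(line):
    """Split one row into ({attribute number: value}, class label).

    Assumes 20 attribute fields followed by a class label (1 or 2).
    """
    fields = line.split()
    if len(fields) != N_ATTRIBUTES + 1:
        raise ValueError(
            f"expected {N_ATTRIBUTES + 1} fields, got {len(fields)}"
        )
    attributes = {}
    for number, raw in enumerate(fields[:N_ATTRIBUTES], start=1):
        # Numerical attributes become ints; categorical codes stay strings.
        attributes[number] = int(raw) if number in NUMERICAL else raw
    return attributes, int(fields[-1])


# Illustrative row in the assumed layout.
sample = ("A11 6 A34 A43 1169 A65 A75 4 A93 A101 4 "
          "A121 67 A143 A152 2 A173 1 A192 A201 1")
attributes, label = parse_row(sample)
categorical = [n for n in attributes if n not in NUMERICAL]
```

With this split, `len(NUMERICAL)` is 7 and `categorical` holds the remaining 13 attribute numbers.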
  30.  
  31.  
  32. 7. Attribute description for german
  33.  
  34. Attribute 1: (qualitative)
  35. Status of existing checking account
  36. A11 : ... < 0 DM
  37. A12 : 0 <= ... < 200 DM
  38. A13 : ... >= 200 DM /
  39. salary assignments for at least 1 year
  40. A14 : no checking account
  41.  
  42. Attribute 2: (numerical)
  43. Duration in months
  44.  
  45. Attribute 3: (qualitative)
  46. Credit history
  47. A30 : no credits taken/
  48. all credits paid back duly
  49. A31 : all credits at this bank paid back duly
  50. A32 : existing credits paid back duly till now
  51. A33 : delay in paying off in the past
  52. A34 : critical account/
  53. other credits existing (not at this bank)
  54.  
  55. Attribute 4: (qualitative)
  56. Purpose
  57. A40 : car (new)
  58. A41 : car (used)
  59. A42 : furniture/equipment
  60. A43 : radio/television
  61. A44 : domestic appliances
  62. A45 : repairs
  63. A46 : education
  64. A47 : (vacation - does not exist?)
  65. A48 : retraining
  66. A49 : business
  67. A410 : others
  68.  
  69. Attribute 5: (numerical)
  70. Credit amount
  71.  
  72. Attribute 6: (qualitative)
  73. Savings account/bonds
  74. A61 : ... < 100 DM
  75. A62 : 100 <= ... < 500 DM
  76. A63 : 500 <= ... < 1000 DM
  77. A64 : ... >= 1000 DM
  78. A65 : unknown/ no savings account
  79.  
  80. Attribute 7: (qualitative)
  81. Present employment since
  82. A71 : unemployed
  83. A72 : ... < 1 year
  84. A73 : 1 <= ... < 4 years
  85. A74 : 4 <= ... < 7 years
  86. A75 : ... >= 7 years
  87.  
  88. Attribute 8: (numerical)
  89. Installment rate in percentage of disposable income
  90.  
  91. Attribute 9: (qualitative)
  92. Personal status and sex
  93. A91 : male : divorced/separated
  94. A92 : female : divorced/separated/married
  95. A93 : male : single
  96. A94 : male : married/widowed
  97. A95 : female : single
  98.  
  99. Attribute 10: (qualitative)
  100. Other debtors / guarantors
  101. A101 : none
  102. A102 : co-applicant
  103. A103 : guarantor
  104.  
  105. Attribute 11: (numerical)
  106. Present residence since
  107.  
  108. Attribute 12: (qualitative)
  109. Property
  110. A121 : real estate
  111. A122 : if not A121 : building society savings agreement/
  112. life insurance
  113. A123 : if not A121/A122 : car or other, not in attribute 6
  114. A124 : unknown / no property
  115.  
  116. Attribute 13: (numerical)
  117. Age in years
  118.  
  119. Attribute 14: (qualitative)
  120. Other installment plans
  121. A141 : bank
  122. A142 : stores
  123. A143 : none
  124.  
  125. Attribute 15: (qualitative)
  126. Housing
  127. A151 : rent
  128. A152 : own
  129. A153 : for free
  130.  
  131. Attribute 16: (numerical)
  132. Number of existing credits at this bank
  133.  
  134. Attribute 17: (qualitative)
  135. Job
  136. A171 : unemployed/ unskilled - non-resident
  137. A172 : unskilled - resident
  138. A173 : skilled employee / official
  139. A174 : management/ self-employed/
  140. highly qualified employee/ officer
  141.  
  142. Attribute 18: (numerical)
  143. Number of people being liable to provide maintenance for
  144.  
  145. Attribute 19: (qualitative)
  146. Telephone
  147. A191 : none
  148. A192 : yes, registered under the customer's name
  149.  
  150. Attribute 20: (qualitative)
  151. Foreign worker
  152. A201 : yes
  153. A202 : no
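
The codes above can be turned back into readable labels with a simple lookup. The sketch below covers only three attributes to stay short. The labels are transcribed from the list above, and the names `CODE_LABELS` and `decode` are hypothetical.

```python
# Labels transcribed from the attribute description (attributes 1, 15, 20).
CODE_LABELS = {
    "A11": "checking account: ... < 0 DM",
    "A12": "checking account: 0 <= ... < 200 DM",
    "A13": "checking account: ... >= 200 DM / salary assignments "
           "for at least 1 year",
    "A14": "no checking account",
    "A151": "housing: rent",
    "A152": "housing: own",
    "A153": "housing: for free",
    "A201": "foreign worker: yes",
    "A202": "foreign worker: no",
}


def decode(value):
    """Return the label for a known code, or the value unchanged."""
    return CODE_LABELS.get(value, value)


decoded = [decode(v) for v in ["A14", 24, "A152", "A201"]]
```

Numerical values and codes missing from the table pass through unchanged, so the same function can be applied to every field of a row.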
  154.  
  155.  
  156.  
  157. 8. Cost Matrix
  158.  
  159. This dataset requires use of a cost matrix (see below)
  160.  
  161.  
  162.          1    2
  163. ------------------
  164.     1    0    1
  165. ------------------
  166.     2    5    0
  167.  
  168. (1 = Good, 2 = Bad)
  169.  
  170. The rows represent the actual classification and the columns
  171. the predicted classification.
  172.  
  173. It is worse to class a customer as good when they are bad (5),
  174. than it is to class a customer as bad when they are good (1).
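
One way to apply the cost matrix is to sum the cost of each (actual, predicted) pair and, when a model outputs a probability of "bad", to pick the class with the lower expected cost. The following is a minimal sketch under those assumptions. The names `total_cost` and `min_cost_class` are hypothetical, and the example labels are made up.

```python
GOOD, BAD = 1, 2

# COST[actual][predicted], following the cost matrix above.
COST = {
    GOOD: {GOOD: 0, BAD: 1},
    BAD: {GOOD: 5, BAD: 0},
}


def total_cost(actual, predicted):
    """Sum the misclassification cost over paired labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(COST[a][p] for a, p in zip(actual, predicted))


def min_cost_class(p_bad):
    """Pick the class with the lower expected cost given P(bad)."""
    # Predicting good is only costly when the customer is actually bad.
    cost_if_predict_good = COST[BAD][GOOD] * p_bad
    # Predicting bad is only costly when the customer is actually good.
    cost_if_predict_bad = COST[GOOD][BAD] * (1 - p_bad)
    return BAD if cost_if_predict_bad < cost_if_predict_good else GOOD


actual = [GOOD, GOOD, BAD, BAD]
predicted = [GOOD, BAD, GOOD, BAD]
cost = total_cost(actual, predicted)  # 0 + 1 + 5 + 0 = 6
```

Because a missed bad customer costs five times as much as a rejected good one, the expected costs cross at P(bad) = 1/6, so `min_cost_class` predicts "bad" well below the usual 0.5 threshold.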