Jan 6th, 2023
# -*- coding: utf-8 -*-
"""
Created on Thu Dec 22 15:59:42 2022

@author: 12089
"""
#print(df.columns.tolist())
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA, FactorAnalysis
from sklearn.impute import SimpleImputer
from factor_analyzer.factor_analyzer import FactorAnalyzer

#pd.set_option('max_rows', None)
#pd.set_option('max_columns', None)

#df.groupby('poly')['sexrate'].mean().sort_index(ascending=False)
#df['poly'].corr(df['sexrate'])

df = pd.read_csv('data.csv')
df = df.dropna(how='all', axis=1)

# Drop the precomputed scale totals; they are rebuilt from the individual items further down
df = df.drop(labels=['fight','openness','friendship','jealousy','codependent','healthy','sex', 'mismatch',
],axis='columns')

# Keep only rows that have a finish timestamp
df = df.loc[~df['Time Finished (UTC)'].isna()]

# Exclude inconsistent or implausible responses
bad_sex = (df['sexrate'] <= 1) & (df['sex2'] <= -3)
df = df.loc[~bad_sex]

bad_health = (df['healthy6'] >= 3) & (df['healthy3'] >= 3)
df = df.loc[~bad_health]

bad_sexrate = (df['sexrate'] <= -1) & (df['healthy3'] >= 35)
df = df.loc[~bad_sexrate]

bad_age = (df['selfage'] >= 91)
df = df.loc[~bad_age]

bad_age = (df['partnerage'] >= 92)
df = df.loc[~bad_age]

bad_age = (df['partnerage'] <= 15)
df = df.loc[~bad_age]

bad_age = (df['selfage'] <= 16)
df = df.loc[~bad_age]

bad_income = (df['income'] >= 1500000)
df = df.loc[~bad_income]

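# A minimal sketch, not part of the original analysis: the repeated
# "build a mask, then drop the rows" steps above could be folded into one
# helper that also reports how many rows each rule removes.
# `apply_exclusions` and its `rules` argument are hypothetical names.
import pandas as pd

def apply_exclusions(frame, rules):
    """Apply each named exclusion rule in order.

    `rules` maps a label to a function that takes the current frame and
    returns a boolean mask of the rows to drop. Returns the filtered frame
    and a dict with the number of rows removed by each rule.
    """
    removed = {}
    for label, make_mask in rules.items():
        mask = make_mask(frame).fillna(False).astype(bool)
        removed[label] = int(mask.sum())
        frame = frame.loc[~mask]
    return frame, removed

# Possible usage, mirroring two of the filters above:
# df, removed = apply_exclusions(df, {
#     'bad_income': lambda d: d['income'] >= 1500000,
#     'bad_selfage': lambda d: d['selfage'] >= 91,
# })
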
df = df.drop(labels=['Socially speaking, you tend to be more (ih64ovk)','Run','Program Version','User','Time Started (UTC)','Position','Points',
'Your age? (rkkox57)','Which category fits you best? (gpaq7lv)','Economically speaking, you tend to be more (2wpg7cg)',
'Do you practice a traditional religion? (qj2numz)','Are you more monogamous or polyamorous? (30c17ya)','Is this your first time submitting the survey, or are you resubmitting it in order to answer for additional partners? (qmf1d3w)',
'In a world where your partner was fully aware and deeply okay with it, how much would you be interested in having sexual/romantic experiences with people besides your partner? (ao3mcdk)','In a world where you were fully aware and deeply okay with it, how much would *your partner* be interested in having sexual/romantic experiences with people besides you? (wcq3vrx)',
'How long have you been in a relationship with this person? (kh74yju)','To get a little more specific, how long have you been in a relationship with this person? (wqx272y)','How old is your partner? (us6u6co)','To get a little more specific, how long have you been in a relationship with this person? (beugzym)','Which category fits your partner best? (u9jccbo)','Are you married to your partner? (pfqs9ad)',
'Do you have children with your partner? (qgjf1nu)','On average, over the last six months, about how often do you and your partner have sex? (3pja5nd)','On average, over the last six months, about how often do you watch porn or consume erotic content for the purposes of arousal? (vnw3xxz)','To get a little more specific, how long have you been in a relationship with this person? (tl3n25m)','How often do you and your partner have a fight? (x6jw4sp)','agreescale',
'Randomize (hk79de0)','"It’s hard to imagine being happy without this relationship." (6u0bje)','"I have no secrets from my partner" (bgassjt)','"If my partner and I ever split up, it would be a logistical nightmare (e.g., separating house, friends) (e1claef)',
'"If my relationship ended I would be absolutely devastated" (2ytl03s)','To get a little more specific, how long have you been in a relationship with this person? (l0f4zc7)','"I sometimes worry that my partner will leave me for someone better" (xkjzgym)',
'"My relationship is playful" (w2uykq1)','"My partner an I are politically aligned" (12ycrs5)','"We have compatible humor" (o9empfe)',""""The long-term routines and structure of my life are intertwined with my partner's" (li0toxk)""",""""The passion in this relationship is deeply intense" (gwzrhth)""",'"I share the same hobbies with my partner" (89hl8ys)',
""""My relationship causes me grief or sorrow" (rm0dtr6)""",""""In hindsight, getting into this relationship was a bad idea" (1y6wfih)""",""""I feel like I would still be a desirable mate even if my partner left me" (qboob7y)""",""""I often feel jealousy in my relationship" (kfcicm9)""",""""I think this relationship will last for a very long time" (ob8595u)""",
""""My partner enables me to learn and grow" (e2oy448)""",""""My partner doesn't excite me" (6fcm06c)""",""""My partner doesn't sexually fulfill me" (xxf5wfc)""",""""I rely on my partner for a sense of self worth" (j0nv7n9)""",""""My partner and I handle fights well" (brtsa94)""",""""I feel confident in my relationship's ability to withstand everything life has to throw at us" (p81ekto)""",""""My partner is vital to my sense of self worth" (bvgpdw1)""",
""""I sometimes fear my partner" (a21v31h)""",""""I try to stay aware of my partner's potential infidelity" (5qbgizc)""", """"I share my thoughts and opinions with my partner" (6lwugp9)""",
""""This relationship is good for me" (wko8n8m)""",""""My partner takes priority over everything else in my life" (2sslsr1)""",""""We respect each other" (c39vvrk)""",""""My partner is more concerned with being right than with getting along" (rlkw670)""",""""I am more needy than my partner" (f3or362)""",""""I feel emotionally safe with my partner" (or9gg0a)""",""""I'm satisfied with our sex life" (6g14ks)""",""""My partner physically desires me" (kh7ppyp)""",
""""My partner and I feel comfortable explicitly discussing our relationship on a meta level" (jrzzb06)""",""""My partner knows all my sexual fantasies" (s3cgjd2)""",""""My partner and I are intellectually matched" (ku1vm67)""",""""I am careful to maintain a personal identity separate from my partner" (u5esujt)""",""""I'm worried I'm not good enough for my partner" (45rohqq)""",""""My partner judges me" (fr4mr4a)""","""Did you answer this survey honestly/for a real partner? (7bfie2v)""",
'Adolf_Hitler_and_Eva_Braun__history','Allie_and_Noah__The_Notebook','Andy_Dwyer_and_April_Ludgate__Parks_and_Rec','Angel_and_Buffy__Buffy_the_Vampire_Slayer','Antony_and_Cleopatra__history','Aragorn_and_Arwen__Lord_of_the_Rings','Baby_and_Johnny__Dirty_Dancing','Belle_and_the_Beast__Beauty_and_the_Beast','Bert_and_Ernie__Sesame_Street','Bob_and_Linda__Bob_s_Burgers','Bonnie_and_Clyde__history','Carrie_and_Big__Sex_and_the_City','Chrisitan_and_Anastasia__Fifty_Shades_of_Grey',
'Dani_and_Christian__Midsommar','Edward_and_Bella__Twilight','Elizabeth_Bennett_and_Mr_Darcy__Pride_and_Prejudice','Finn_and_Flame_Princess__Adventure_Time','Forrest_and_Jenny__Forrest_Gump','Gomez_and_Morticia_Addams__The_Addams_Family','Han_and_Leia__Star_Wars','Harley_Quinn_and_The_Joker__Batman','Harry_and_Draco__Harry_Potter_Fanfiction','Hazel_and_Agustus__The_Fault_In_Our_Stars','Homer_and_Marge_Simpson__The_Simpsons','Jack_and_Rose__The_Titanic','Jasmine_and_Aladdin__Aladdin',
'John_and_Jackie_Kennedy__history','Johnny_and_Lisa__The_Room','Kelly_and_Ryan__The_Office','Kermit_and_Miss_Piggy__The_Muppets','Leonard_and_Penny__Big_Bang_Theory','Lily_and_Marshall__How_I_Met_Your_Mother','Lucy_and_Ricky__I_Love_Lucy',
'Mickey_and_Minnie__Disney','Monica_and_Chandler__friends','Mr_Incredible_and_Elastigirl__The_Incredibles','Pam_and_Jim__The_Office','Piper_and_Alex__Orange_Is_The_New_Black','Queen_Elizabeth_II_and_Prince_Philip_Duke_of_Edinburgh__history',
'Romeo_and_Juliette__Shakespeare','Ross_and_Rachel__Friends','Shrek_and_Fiona__Shrek','Spongebob_and_Patrick__Spongebob_Squarepants','Tarzan_and_Jane__Tarzan','The_Pope_and_his_hand','Tobias_and_Linsay__Arrested_Development',
'Westley_and_Buttercup__Princess_Bride','Zoe_and_Wash__Firefly','characterNames','scoresForEachCharacter','smallestDistance','jankydistances','characterNumber','characterScores','characterName',
'distance','distances','sortednumbers','item','Roughly speaking, what is your yearly income in USD? (mzfpanh)','earlyAverages','unsortedDistances','fbclid','On average, over the last six months, about how often do you and your partner have sex? (n1iblql)','Randomize (mmsj06g)',
's','Was this your first time taking the survey? (xlmp5n)',
],axis='columns')

bad_honesty = (df['honesty'] <= 1)
df = df.loc[~bad_honesty]

df['firsttime'] = df['Was this your first time taking the survey? (yg1agly)'].replace({'Yes':1,'No':0})
df['surveysource'] = df['You got to this survey from: (mpjxl34)']

# Keep first-time submissions only
bad_time = (df['firsttime'] <= 0)
df = df.loc[~bad_time]

# Map the duration answers to approximate months
duration_months = {
'0-3 months':2, "4-6 months":5,"7-9 months":8,"10-12 months":9,"1-1.5 years":15, "1.5-2 years":21,'2-3 years':30, '3-4 years':42, '4-5 years':54, '5-6 years':66, '7-9 years':96, '10-14 years':144, '15-19 years':204, '20+ years':252}

df['timetillmarriage'] = df['How long did you spend in a romantic relationship with your partner before getting married? (anthw9o)'].replace(
{**duration_months, '5-6 years | 5-6 years':66})

df['timetillchild'] = df['How long had you been in a romantic relationship with your partner when you had your first child? (qxwjbzq)'].replace(duration_months)

# Multi-select answers (containing "|") are not in the mapping; turn them into NaN
df['timetillchild'] = pd.to_numeric(df['timetillchild'], errors='coerce')

df['length'] = df['length'].replace({'218':228,'220':240})

df['agegap'] = df['selfage'] - df['partnerage']

# Treat unanswered variants of these items as 0 so they do not turn the totals into NaN
df['jealousy5b'] = df['jealousy5b'].fillna(0)
df['jealousy5'] = df['jealousy5'].fillna(0)
df['codependent4'] = df['codependent4'].fillna(0)
df['codependent4b'] = df['codependent4b'].fillna(0)

# Scale totals; subtracted items are reverse-keyed
df['fight'] = df['fight1'] + df['fight2'] + df['fight3'] + df['fight4'] - df['fight5'] - df['fight6']
df['openness'] = df['openness1'] + df['openness2'] + df['openness3'] + df['openness4'] + df['openness5'] + df['openness6']
df['friendship'] = df['friendship1'] + df['friendship2'] + df['friendship3'] + df['friendship4'] + df['friendship5'] + df['friendship6']
df['jealousy'] = df['jealousy1'] + df['jealousy2'] + df['jealousy3'] - df['jealousy4'] + df['jealousy5'] + df['jealousy5b'] + df['jealousy6']
df['codependent'] = df['codependent1'] + df['codependent2'] + df['codependent3'] + df['codependent4b'] + df['codependent4'] + df['codependent5'] + df['codependent6']
df['healthy'] = df['healthy1'] + df['healthy2'] + df['healthy3'] + df['healthy4'] + df['healthy5'] - df['healthy6']
df['sex'] = df['sex1'] - df['sex2'] + df['sex3'] + df['sex4'] + df['sex5'] - df['sex6']
df['mismatch'] = df['mismatch1'] + df['mismatch2'] - df['mismatch3'] + df['mismatch4'] + df['mismatch5'] + df['mismatch6']

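# A minimal sketch, not part of the original analysis: the scale totals
# above all follow the same pattern (sum the items, subtract the
# reverse-keyed ones), so one helper could compute any of them.
# `scale_score` and `reversed_items` are hypothetical names.
import pandas as pd

def scale_score(frame, items, reversed_items=()):
    """Sum the item columns, subtracting the reverse-keyed ones.

    skipna=False keeps the behaviour of chained `+`/`-`: a missing item
    makes the whole total NaN.
    """
    signs = pd.Series({c: (-1 if c in reversed_items else 1) for c in items})
    return frame[list(items)].mul(signs, axis=1).sum(axis=1, skipna=False)

# Possible usage, equivalent to the 'fight' total above:
# df['fight'] = scale_score(df, ['fight1','fight2','fight3','fight4','fight5','fight6'],
#                           reversed_items=['fight5','fight6'])
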
bad_sexr = (df['sexrate'] <=-1)
df = df.loc[~bad_sexr]
df['sexrate'] = df['sexrate'].replace({0:0.1})
df['marriedadj'] = df['married'].replace({1:0})
# Years since age 16, relationship length in years, and the share of those years spent in this relationship
df['ageadj'] = (df['selfage'] -16)
df['lengthadj'] = (df['length'] / 12)
df['percent'] = (df['lengthadj']/df['ageadj'] * 100)
df['healthadj'] = (df['healthy']/df['length'] *100)
df['sexpartnercountadj'] = df['sexpartnercount'].replace({125:82,175:82,220:82})

df['cheated'] = df['Have you or your partner ever cheated on each other? (hhf9b8h)']

# A relationship cannot cover more than 100% of the years since age 16
bad_perc = (df['percent'] >= 101)
df = df.loc[~bad_perc]

columns = ['fight1','fight2','fight3','fight4','fight5','fight6','openness1','openness2','openness3','openness4','openness5','openness6',
'friendship1','friendship2','friendship3','friendship4','friendship5','friendship6','jealousy1','jealousy2','jealousy3','jealousy4','jealousy5','jealousy5b',
'jealousy6','codependent1','codependent2','codependent3','codependent4b','codependent4','codependent5','codependent6',
'healthy1','healthy2','healthy3','healthy4','healthy5','healthy6','sex1',
'sex2','sex3','sex4','sex5','sex6','mismatch1','mismatch2','mismatch3','mismatch4','mismatch5','mismatch6']
numeric_df = pd.DataFrame(df, columns=columns)

numeric_df = numeric_df.select_dtypes(include='number')

# Fill missing item responses with the column mean before fitting
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
numeric_df_imputed = imputer.fit_transform(numeric_df)

fa = FactorAnalysis(svd_method='lapack')
#fa = FactorAnalyzer(rotation='oblimin')
fa.fit(numeric_df_imputed)
numeric_df_transformed = fa.transform(numeric_df_imputed)

factor_loadings = fa.components_

column_labels = list(numeric_df.columns)

# Initialize a PCA model and fit it to the imputed data
pca = PCA()
pca.fit(numeric_df_imputed)

# Get the explained variance for each principal component
explained_variance = pca.explained_variance_

# Create a pandas DataFrame from the factor loadings array (one row per factor)
factor_loadings_numeric_df = pd.DataFrame(factor_loadings, columns=column_labels)

# Add a column for the explained variance.
# Caution: these are PCA variances paired with the factors by row position only;
# they are not the variances of the factors themselves.
factor_loadings_numeric_df["Explained Variance"] = explained_variance

# Sort the DataFrame by the explained variance
factor_loadings_numeric_df.sort_values("Explained Variance", ascending=False, inplace=True)

factortransposed = factor_loadings_numeric_df.transpose()

numeric_dft = pd.DataFrame(numeric_df_transformed)

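# A hedged alternative, not part of the original analysis: instead of
# borrowing variances from a separate PCA, each factor can be ranked by its
# own sum of squared loadings. This sketch assumes `loadings` has one row
# per factor and one column per item, as fa.components_ does.
# `factor_variance_table` and the 'SS Loadings' column are hypothetical names.
import numpy as np
import pandas as pd

def factor_variance_table(loadings, labels):
    """Return the loadings table sorted by each factor's sum of squared loadings."""
    labels = list(labels)
    table = pd.DataFrame(np.asarray(loadings), columns=labels)
    table['SS Loadings'] = (table[labels] ** 2).sum(axis=1)
    return table.sort_values('SS Loadings', ascending=False)

# Possible usage with the fitted model above:
# ranked_loadings = factor_variance_table(fa.components_, column_labels)
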
# Define the bins (relationship length in months)
bins = [0,3,6,9,12,18,24,36,48,60,72,84,108,144,192,540]
binsshorter = [0,6,12,24,48,74,108,192,540]
binsshortest = [0,12,48,144,540]

# Create new columns with the binned data
df['binnedlength'] = pd.cut(df['length'], bins)
df['binnedlengthshort'] = pd.cut(df['length'], binsshorter)
df['binsshortest'] = pd.cut(df['length'], binsshortest)

# Find the average 'healthy' score for each bin
df.groupby('binnedlength', observed=False)['healthy'].mean()

# Subgroups; each mask is built from the frame it filters
biomale = df.loc[df['biomale'] ==1]
cisbiomale = biomale.loc[biomale['cis'] ==1]
straightcisbiomale = cisbiomale.loc[cisbiomale['biomalep']==0]

biofemale = df.loc[df['biomale'] ==0]
cisbiofemale = biofemale.loc[biofemale['cis'] ==1]
straightcisbiofemale = cisbiofemale.loc[cisbiofemale['biomalep']==1]

nokids = df.loc[df['childcount'] ==0]
yeskids = df.loc[df['childcount'] >=1]
femalenokids = biofemale.loc[biofemale['childcount'] ==0]
femaleyeskids = biofemale.loc[biofemale['childcount'] >=1]
malenokids = biomale.loc[biomale['childcount'] ==0]
maleyeskids = biomale.loc[biomale['childcount'] >=1]

twitter = df.loc[df['surveysource'] =="twitter"]
other = df.loc[df['surveysource'] =="other"]

percent_bins = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
bin_names = ['0-10', '10-20', '20-30', '30-40', '40-50', '50-60', '60-70', '70-80', '80-90', '90-100']

# assign() returns a new frame, so the subgroup slices are not modified in place
biofemale = biofemale.assign(percent_binned=pd.cut(biofemale['percent'], percent_bins, labels=bin_names))
biofemale.groupby('percent_binned', observed=False)['sexrate'].mean().sort_index(ascending=True)

biomale = biomale.assign(percent_binnedm=pd.cut(biomale['percent'], percent_bins, labels=bin_names))
biomale.groupby('percent_binnedm', observed=False)['sexrate'].mean().sort_index(ascending=True)

grouped_dff = biofemale.groupby('binnedlength', observed=False)
grouped_dfm = biomale.groupby('binnedlength', observed=False)
grouped_dfsf = straightcisbiofemale.groupby('binnedlength', observed=False)
grouped_dfsm = straightcisbiomale.groupby('binnedlength', observed=False)
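
# A minimal sketch, not part of the original analysis: the per-subgroup
# groupby objects above could be summarised side by side, one column per
# subgroup. `mean_by_bin` and its arguments are hypothetical names.
import pandas as pd

def mean_by_bin(groups, by, value):
    """Mean of `value` within each `by` bin, one column per named subgroup."""
    return pd.DataFrame({
        name: frame.groupby(by, observed=False)[value].mean()
        for name, frame in groups.items()
    })

# Possible usage with the subgroups defined above:
# mean_by_bin({'female': biofemale, 'male': biomale}, 'binnedlength', 'sexrate')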