Untitled

a guest, Aug 24th, 2019
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1. Analytic Approach\n",
    "\n",
    "Descriptive analytics\n",
    "\n",
    "Descriptive analytics is a preliminary stage of data processing that summarizes historical data to yield useful information and, where needed, prepare the data for further analysis."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "2. Data Requirements\n",
    "\n",
    "My emails from the last 5 years, including sender, recipient, subject, sending time, etc."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "3. Data Collection\n",
    "\n",
    "Using Excel + VBA (run from Excel, not from this notebook):\n",
    "\n",
    "    Sub GetSender()\n",
    "        ' Early binding: needs a reference to the Microsoft Outlook Object Library.\n",
    "        Dim myOlApp As Outlook.Application\n",
    "        Dim mpfInbox As Outlook.MAPIFolder\n",
    "        Dim obj As Outlook.MailItem\n",
    "        Dim ws As Worksheet\n",
    "        Dim i As Long, r As Long   ' Long: an inbox can hold more items than an Integer allows\n",
    "        Set myOlApp = CreateObject(\"Outlook.Application\")\n",
    "        Set mpfInbox = myOlApp.GetNamespace(\"MAPI\").GetDefaultFolder(olFolderInbox)\n",
    "        Set ws = Workbooks(\"Book1.xls\").Worksheets(\"my personal email\")\n",
    "        r = 1\n",
    "        For i = mpfInbox.Items.Count To 1 Step -1\n",
    "            ' Skip meeting requests, reports and other non-mail items\n",
    "            If mpfInbox.Items(i).Class = olMail Then\n",
    "                Set obj = mpfInbox.Items(i)\n",
    "                ws.Cells(r, 1) = obj.SenderEmailAddress\n",
    "                ws.Cells(r, 2) = obj.SenderName\n",
    "                ' MailItem has no ReceiverEmailAddress/ReceiverName properties:\n",
    "                ' use the first recipient's address and the To line instead\n",
    "                If obj.Recipients.Count > 0 Then ws.Cells(r, 3) = obj.Recipients(1).Address\n",
    "                ws.Cells(r, 4) = obj.To\n",
    "                ws.Cells(r, 5) = obj.Subject\n",
    "                r = r + 1   ' separate row counter, so skipped items leave no blank rows\n",
    "            End If\n",
    "        Next i\n",
    "    End Sub\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 4. Data Understanding and Preparation\n",
    "\n",
    "import pandas as pd\n",
    "from pandasql import sqldf\n",
    "\n",
    "# read_excel already returns a DataFrame\n",
    "mails = pd.read_excel('data.xlsx')\n",
    "print(mails)\n",
    "\n",
    "# Run SQL against the DataFrames defined in this notebook\n",
    "pysqldf = lambda sql: sqldf(sql, globals())\n",
    "\n",
    "# Number of emails per sender, most frequent first\n",
    "sql1 = \"select sendername, count(*) as n from mails group by sendername order by n desc\"\n",
    "print(pysqldf(sql1))\n",
    "\n",
    "# Number of emails per receiver, most frequent first\n",
    "sql2 = \"select receivername, count(*) as n from mails group by receivername order by n desc\"\n",
    "print(pysqldf(sql2))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "5. Modeling and Evaluation\n",
    "\n",
    "The top 5 receivers: tangzh, sunzc, zhaobx, tianliang, yinyue\n",
    "\n",
    "The top 5 senders: tianliang, yinyue, hanhongyi, tangzh, guomeng\n",
    "\n",
    "The question: judging by email communication, with whom do I have the closest relationship?\n",
    "\n",
    "The answer: tangzh, tianliang, and yinyue, the three people who appear in both top-5 lists."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python",
   "language": "python",
   "name": "conda-env-python-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
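
The step from the two top-5 lists to the final answer in section 5 can be sketched in a few lines. This is only an illustration, not the notebook author's code: it assumes the answer is the set of names that appear in both top-n lists, it uses the column names sendername and receivername from the SQL queries, and the helper names top_names and closest_contacts as well as the toy rows are hypothetical. It uses only the standard library so it runs without data.xlsx.

```python
from collections import Counter

def top_names(rows, column, n=5):
    """Return the n most frequent non-empty values of `column`, most frequent first."""
    counts = Counter(row[column] for row in rows if row.get(column))
    return [name for name, _ in counts.most_common(n)]

def closest_contacts(rows, n=5):
    """Names that are in both the top-n senders and the top-n receivers (sender order)."""
    senders = top_names(rows, "sendername", n)
    receivers = set(top_names(rows, "receivername", n))
    return [name for name in senders if name in receivers]

# Toy rows standing in for the exported spreadsheet (hypothetical data)
demo = [
    {"sendername": "ann", "receivername": "bob"},
    {"sendername": "ann", "receivername": "bob"},
    {"sendername": "bob", "receivername": "ann"},
    {"sendername": "bob", "receivername": "cat"},
    {"sendername": "bob", "receivername": "cat"},
    {"sendername": "cat", "receivername": "cat"},
]

result = closest_contacts(demo, n=2)
print(result)
```

With the real data, the rows could be obtained from the DataFrame via mails.to_dict("records") and the function called with n=5.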