- {
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# Counting values in the `产品` (product) column: how many times each product was sold"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 194,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style scoped>\n",
- " .dataframe tbody tr th:only-of-type {\n",
- " vertical-align: middle;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: right;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>姓名</th>\n",
- " <th>产品</th>\n",
- " <th>售价</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>0</th>\n",
- " <td>张三</td>\n",
- " <td>A</td>\n",
- " <td>20</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>1</th>\n",
- " <td>李四</td>\n",
- " <td>B</td>\n",
- " <td>30</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>2</th>\n",
- " <td>王五</td>\n",
- " <td>B</td>\n",
- " <td>30</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>3</th>\n",
- " <td>赵六</td>\n",
- " <td>A</td>\n",
- " <td>20</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>4</th>\n",
- " <td>小明</td>\n",
- " <td>A</td>\n",
- " <td>20</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>5</th>\n",
- " <td>小红</td>\n",
- " <td>C</td>\n",
- " <td>50</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " 姓名 产品 售价\n",
- "0 张三 A 20\n",
- "1 李四 B 30\n",
- "2 王五 B 30\n",
- "3 赵六 A 20\n",
- "4 小明 A 20\n",
- "5 小红 C 50"
- ]
- },
- "execution_count": 194,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "import pandas as pd\n",
- "df=pd.DataFrame({\"姓名\":['张三','李四','王五','赵六','小明','小红'],\"产品\":['A','B','B','A','A','C'],\"售价\":[20,30,30,20,20,50]})\n",
- "df"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
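- "# Method ①:\n",
- "\n",
- "(1) Group by `产品` (product) and add a `count` column holding the size of each row's group. The trailing `['产品']` only selects a single column, so the result is a Series.\n",
- "\n",
- "(2) Drop duplicate rows by `产品`, keeping the first row for each product.\n",
- "\n",
- "(3) Reset the index."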
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 72,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style scoped>\n",
- " .dataframe tbody tr th:only-of-type {\n",
- " vertical-align: middle;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: right;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>产品</th>\n",
- " <th>姓名</th>\n",
- " <th>售价</th>\n",
- " <th>count</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>0</th>\n",
- " <td>A</td>\n",
- " <td>张三</td>\n",
- " <td>20</td>\n",
- " <td>3</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>1</th>\n",
- " <td>B</td>\n",
- " <td>李四</td>\n",
- " <td>30</td>\n",
- " <td>2</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>2</th>\n",
- " <td>B</td>\n",
- " <td>王五</td>\n",
- " <td>30</td>\n",
- " <td>2</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>3</th>\n",
- " <td>A</td>\n",
- " <td>赵六</td>\n",
- " <td>20</td>\n",
- " <td>3</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>4</th>\n",
- " <td>A</td>\n",
- " <td>小明</td>\n",
- " <td>20</td>\n",
- " <td>3</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>5</th>\n",
- " <td>C</td>\n",
- " <td>小红</td>\n",
- " <td>50</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " 产品 姓名 售价 count\n",
- "0 A 张三 20 3\n",
- "1 B 李四 30 2\n",
- "2 B 王五 30 2\n",
- "3 A 赵六 20 3\n",
- "4 A 小明 20 3\n",
- "5 C 小红 50 1"
- ]
- },
- "execution_count": 72,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df[\"count\"]=df.groupby([\"产品\"])['产品'].transform(len)\n",
- "df"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 73,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style scoped>\n",
- " .dataframe tbody tr th:only-of-type {\n",
- " vertical-align: middle;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: right;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>产品</th>\n",
- " <th>姓名</th>\n",
- " <th>售价</th>\n",
- " <th>count</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>0</th>\n",
- " <td>A</td>\n",
- " <td>张三</td>\n",
- " <td>20</td>\n",
- " <td>3</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>1</th>\n",
- " <td>B</td>\n",
- " <td>李四</td>\n",
- " <td>30</td>\n",
- " <td>2</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>2</th>\n",
- " <td>C</td>\n",
- " <td>小红</td>\n",
- " <td>50</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " 产品 姓名 售价 count\n",
- "0 A 张三 20 3\n",
- "1 B 李四 30 2\n",
- "2 C 小红 50 1"
- ]
- },
- "execution_count": 73,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df=df.drop_duplicates(\"产品\")\n",
- "df=df.reset_index(drop=True)\n",
- "df"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
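- "# Method ②\n",
- "\n",
- "### 1. Count the occurrences of each `产品` value (four ways)\n",
- "(1) size (counts all rows in each group)\n",
- "\n",
- "(2) value_counts\n",
- "\n",
- "(3) count (counts non-null values of the selected column)\n",
- "\n",
- "(4) agg('count')"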
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 257,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "产品\n",
- "A 3\n",
- "B 2\n",
- "C 1\n",
- "dtype: int64"
- ]
- },
- "execution_count": 257,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df1=df.groupby(['产品']).size() \n",
- "df1"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 253,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "A 3\n",
- "B 2\n",
- "C 1\n",
- "Name: 产品, dtype: int64"
- ]
- },
- "execution_count": 253,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df1=df[\"产品\"].value_counts()\n",
- "df1"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 250,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "产品\n",
- "A 3\n",
- "B 2\n",
- "C 1\n",
- "Name: 姓名, dtype: int64"
- ]
- },
- "execution_count": 250,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df1=df.groupby(['产品'])[\"姓名\"].count() \n",
- "df1"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 262,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "产品\n",
- "A 3\n",
- "B 2\n",
- "C 1\n",
- "Name: 姓名, dtype: int64"
- ]
- },
- "execution_count": 262,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df1=df.groupby(['产品'])[\"姓名\"].agg(\"count\")\n",
- "df1"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
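- "### 2. Turn the result into a table (three ways)\n",
- "(1) Move the `产品` index into a column and name the value column\n",
- "\n",
- "(2) Reset the index and rename the value column\n",
- "\n",
- "(3) Convert the result to a DataFrame, reset the index, and name the columns"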
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 258,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style scoped>\n",
- " .dataframe tbody tr th:only-of-type {\n",
- " vertical-align: middle;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: right;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>产品</th>\n",
- " <th>count</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>0</th>\n",
- " <td>A</td>\n",
- " <td>3</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>1</th>\n",
- " <td>B</td>\n",
- " <td>2</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>2</th>\n",
- " <td>C</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " 产品 count\n",
- "0 A 3\n",
- "1 B 2\n",
- "2 C 1"
- ]
- },
- "execution_count": 258,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df1=df1.reset_index(name='count') # move the 产品 index into a column and name the value column 'count'\n",
- "df1"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 265,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style scoped>\n",
- " .dataframe tbody tr th:only-of-type {\n",
- " vertical-align: middle;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: right;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>产品</th>\n",
- " <th>count</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>0</th>\n",
- " <td>A</td>\n",
- " <td>3</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>1</th>\n",
- " <td>B</td>\n",
- " <td>2</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>2</th>\n",
- " <td>C</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " 产品 count\n",
- "0 A 3\n",
- "1 B 2\n",
- "2 C 1"
- ]
- },
- "execution_count": 265,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df1.reset_index([\"产品\"]).rename({\"姓名\":\"count\"},axis=1)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 261,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style scoped>\n",
- " .dataframe tbody tr th:only-of-type {\n",
- " vertical-align: middle;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: right;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>产品</th>\n",
- " <th>count</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>0</th>\n",
- " <td>A</td>\n",
- " <td>3</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>1</th>\n",
- " <td>B</td>\n",
- " <td>2</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>2</th>\n",
- " <td>C</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " 产品 count\n",
- "0 A 3\n",
- "1 B 2\n",
- "2 C 1"
- ]
- },
- "execution_count": 261,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df1=df1.rename_axis(\"产品\").to_frame(\"count\") # name the index and the value column explicitly\n",
- "df1.reset_index()"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.7.0"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
- }
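
The notebook above rebinds `df` and `df1` from cell to cell, so the result of each cell depends on the order in which the cells were run. As a rough, self-contained sketch (not part of the original notebook), the two methods can be compared side by side on the same sample data. The variable names `method1`, `by_size`, `by_count`, and `by_value_counts` are made up for this illustration, and `transform("size")` is used here in place of the notebook's `transform(len)`. Because `value_counts` names its result differently across pandas versions, the sketch builds that table from the index and values directly instead of renaming columns.

```python
import pandas as pd

# Same sample data as the notebook's first code cell.
df = pd.DataFrame({
    "姓名": ['张三', '李四', '王五', '赵六', '小明', '小红'],
    "产品": ['A', 'B', 'B', 'A', 'A', 'C'],
    "售价": [20, 30, 30, 20, 20, 50],
})

# Method 1: attach each row's group size, keep the first row per product,
# then renumber the rows. assign() returns a copy, so df itself is unchanged.
method1 = (
    df.assign(count=df.groupby("产品")["产品"].transform("size"))
      .drop_duplicates("产品")
      .reset_index(drop=True)
)

# Method 2: aggregate to one value per product, then turn the Series
# into a two-column table with reset_index(name=...).
by_size = df.groupby("产品").size().reset_index(name="count")
by_count = df.groupby("产品")["姓名"].count().reset_index(name="count")

# value_counts() sorts by frequency, so sort by label to make it comparable.
vc = df["产品"].value_counts().sort_index()
by_value_counts = pd.DataFrame({"产品": vc.index, "count": vc.values})

print(method1)
print(by_size)
```

Method 1 keeps the other columns of the first row for each product (`姓名`, `售价`), while Method 2 returns only `产品` and `count`. Which one fits depends on whether those extra columns are meaningful for the summary.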