A website for self learning, collecting and sharing.
The purpose and mechanics of if __name__ == '__main__':
The gzip module
pypinyin
Ctrl + [
Ctrl + ]
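Alt + Shift: column selection. In this mode you can only select a vertical block; you cannot place cursors at arbitrary positions, so it only works for adjacent lines in the same column.
Shift + Ctrl: column selection.
Ctrl + Click: select multiple edit points. In this mode you can select a column and also place cursors at several separate locations.
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple some-package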
pip install pip -U
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple pip -U
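def Combinatorial(n, i):
    '''Compute the binomial coefficient C(n, i); requires n >= i.'''
    Min = min(i, n - i)
    result = 1
    for j in range(0, Min):
        # Before this step result equals C(n, j), so dividing by (j + 1)
        # is always exact and integer division avoids float rounding
        result = result * (n - j) // (j + 1)
    return result
if __name__ == '__main__':
    print(Combinatorial(45, 2))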
from scipy.special import comb, perm
# Number of permutations
A = perm(3, 2)
# Number of combinations
C = comb(45, 2)
print(A, C)
import math
def factorial_me(n):
    '''Compute n! with a simple loop.'''
    result = 1
    for i in range(2, n + 1):
        result = result * i
    return result
def comb_1(n, m):
    # Number of combinations, using math.factorial
    return math.factorial(n) // (math.factorial(n - m) * math.factorial(m))
def comb_2(n, m):
    # Number of combinations, using our own factorial function
    return factorial_me(n) // (factorial_me(n - m) * factorial_me(m))
def perm_1(n, m):
    # Number of permutations, using math.factorial
    return math.factorial(n) // math.factorial(n - m)
def perm_2(n, m):
    # Number of permutations, using our own factorial function
    return factorial_me(n) // factorial_me(n - m)
if __name__ == '__main__':
    print(factorial_me(6))
    print(comb_1(45, 2))
    print(comb_2(45, 2))
    print(perm_1(45, 2))
    print(perm_2(45, 2))
from itertools import combinations, permutations
# List all permutations: [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]
print(list(permutations(range(1, 4), 2)))
# List all combinations: [(1, 2), (1, 3), (2, 3)]
print(list(combinations([1, 2, 3], 2)))
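On Python 3.8 or later, the standard library can also compute these counts directly, without SciPy or hand-written factorials. The sketch below assumes such a Python version; the variable names are only for illustration.

```python
import math

# math.comb and math.perm return exact integers (available since Python 3.8)
n_combinations = math.comb(45, 2)   # ways to choose 2 items out of 45
n_permutations = math.perm(45, 2)   # ordered arrangements of 2 items out of 45

print(n_combinations, n_permutations)
```

Both functions work on arbitrarily large integers, so there is no floating-point rounding to worry about.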
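factorial(n)
gamma(n+1)
v='n!'; vpa(v)
combntns(x,m)   % list all combinations of m elements taken from the n-element vector x
nchoosek(n,k)   % number of combinations of k elements chosen from n
nchoosek(x,m)   % all combinations of m elements taken from vector x
perms(x)        % all permutations of vector x
prod(n:m)       % permutation count: m*(m-1)*(m-2)*…*(n+1)*n (m>n)
prod(1:2:2*n-1) % (2n-1)!!
prod(2:2:2*n)   % (2n)!!
prod(A)         % product of each column of matrix A
prod(A,dim)     % dim=1 (default): column products; dim=2: row products (equivalent to (prod(A'))')
cumprod(n:m)    % vector [n, n*(n+1), n*(n+1)*(n+2), …, n*(n+1)*(n+2)*…*(m-1)*m]
cumprod(A)      % A is a matrix; returns a matrix of the same size with cumulative products down each column
cumprod(A,dim)  % A is a matrix; dim=1 (default, as above); dim=2: cumulative products along each row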
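format rat
This command makes results display as fractions.
The purpose and mechanics of if __name__ == '__main__':
A Python file is typically used in one of two ways: it is executed directly as a script, or it is imported into another Python script and called from there (module reuse). The statement if __name__ == '__main__': controls what runs in these two cases. Code under if __name__ == '__main__': is executed only in the first case (the file is run directly as a script) and is not executed when the file is imported into another script. For example, create test.py with the following content: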
# test.py
print('this is one')
if __name__ == '__main__':
    print('this is two')
Run test.py directly. The result is shown below: the code both before and inside the if __name__ == '__main__': block is executed.
this is one
this is two
Next, try running it through import. In the same folder, create a script named import-test.py with the content below. Running it prints only this is one.
# import-test.py
import test
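Every Python module (a Python file, here test.py and import-test.py) has a built-in variable __name__. When the module is executed directly, __name__ is set to the string '__main__'. When the module is imported into another module, __name__ is the module name (the file name without the .py suffix).
In other words, '__main__' is the name Python gives to the top-level module that is currently being run, so __name__ == '__main__' is true only when the module is executed directly.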
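To illustrate further, add print(__name__) before the if __name__ == '__main__': line in test.py so that __name__ is printed. The result: when test.py is run directly the output is __main__, and when it is run through import the output is test.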
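The two cases can also be observed from a single script. The following is a minimal sketch, not part of the original notes: it writes a throwaway module to a temporary directory (the file name demo_mod.py and the helper observe_names are hypothetical), runs it once as a script via runpy and once through the import machinery, and records the value of __name__ each time.

```python
import importlib.util
import runpy
import tempfile
from pathlib import Path

# Hypothetical one-line module that just records its own __name__
SOURCE = "name_seen = __name__\n"

def observe_names():
    """Return (__name__ when run as a script, __name__ when imported)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "demo_mod.py"
        path.write_text(SOURCE, encoding="utf-8")

        # Case 1: execute the file the way "python demo_mod.py" would
        script_globals = runpy.run_path(str(path), run_name="__main__")
        as_script = script_globals["name_seen"]

        # Case 2: load the same file as an imported module
        spec = importlib.util.spec_from_file_location("demo_mod", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        as_import = module.name_seen

    return as_script, as_import

print(observe_names())
```

The first value is the string '__main__' and the second is the module name without the .py suffix.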
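Find the Python IDLE shortcut on the desktop or in the Start menu, or locate pythonw.exe in the installation directory. Right-click it and choose "Properties"; in the properties window, edit the "Start in" field to change the default directory for opening and saving files.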
Full syntax of the print() function: print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
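A short sketch of how sep, end and file interact may help here. It writes into an in-memory io.StringIO buffer instead of the terminal so the exact characters can be inspected; the variable names are only for illustration.

```python
import io

buffer = io.StringIO()

# Defaults: values separated by a single space, line ended with '\n'
print("a", "b", "c", file=buffer)

# Custom separator and line ending
print("a", "b", "c", sep="-", end="!\n", file=buffer)

captured = buffer.getvalue()
print(captured, end="")
```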
from collections import Counter
ls = ['a', 'b', 'c', 'c', 'd', 'b', 'a', 'a', 'c', 'c']
counted_ls = Counter(ls)
sorted_ls_with_count = sorted(counted_ls.items(), key=lambda x: x[1], reverse=True)
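For this particular task, Counter also offers most_common(), which returns the (element, count) pairs already sorted from most to least frequent. A minimal sketch using the same list:

```python
from collections import Counter

ls = ['a', 'b', 'c', 'c', 'd', 'b', 'a', 'a', 'c', 'c']

# All elements, most frequent first
by_frequency = Counter(ls).most_common()

# Only the two most frequent elements
top_two = Counter(ls).most_common(2)

print(by_frequency)
print(top_two)
```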
my_dict = dict()
my_dict["name"] = "lowman"
my_dict["age"] = 26
my_dict["girl"] = "Tailand"
my_dict["money"] = 80
my_dict["hourse"] = None
for key, value in my_dict.items():
    print(key, value)
Output:
money 80
girl Tailand
age 26
hourse None
name lowman
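As the output shows, iterating over a plain dict returned the items in a different order from the one in which they were defined (this output comes from an older Python version). Note: CPython 3.6 reimplemented dict so that it preserves insertion order, and from Python 3.7 onward this ordering is guaranteed by the language, so on current versions dict ordering is no longer a concern.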
import collections
my_order_dict = collections.OrderedDict()
my_order_dict["name"] = "lowman"
my_order_dict["age"] = 45
my_order_dict["money"] = 998
my_order_dict["hourse"] = None
for key, value in my_order_dict.items():
    print(key, value)
Output:
name lowman
age 45
money 998
hourse None
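An ordered dict outputs its elements in the order in which they were inserted. Note: all an ordered dict does is remember insertion order and replay it. If its contents are supplied up front from an unordered source instead of being inserted one by one (for example, from a plain dict on Python versions before 3.6), the original order is already lost and iteration is still unordered. Ordered dicts are therefore mainly useful when items are added dynamically and must be output in the order they were added.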
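On Python 3.7 and later the practical differences between dict and OrderedDict are small. The sketch below, which assumes such a version and uses illustrative variable names, shows that both keep insertion order, and highlights two things OrderedDict still adds: order-sensitive equality and move_to_end().

```python
from collections import OrderedDict

plain = {}
ordered = OrderedDict()
for key, value in [("name", "lowman"), ("age", 45), ("money", 998)]:
    plain[key] = value
    ordered[key] = value

# Both preserve insertion order on Python 3.7+
plain_keys = list(plain)
ordered_keys = list(ordered)

# Same items, different insertion order
swapped = OrderedDict([("age", 45), ("name", "lowman"), ("money", 998)])
equal_as_dicts = dict(swapped) == plain   # plain dict equality ignores order
equal_as_ordered = swapped == ordered     # OrderedDict equality checks order

# OrderedDict can also move an existing key to the end
ordered.move_to_end("name")
keys_after_move = list(ordered)

print(plain_keys, ordered_keys, equal_as_dicts, equal_as_ordered, keys_after_move)
```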
The gzip module
Python's gzip module provides a very simple way to compress and decompress files, and works in a similar manner to the GNU programs gzip and gunzip.
import gzip
import io
import os
output_file_name = 'jd_example.txt.gz'
file_mode = 'wb'
with gzip.open(output_file_name, file_mode) as output:
    with io.TextIOWrapper(output, encoding='utf-8') as encode:
        encode.write('We can write anything in the file here.\n')
print(output_file_name,
      'contains', os.stat(output_file_name).st_size, 'bytes')
os.system('file -b --mime {}'.format(output_file_name))
import gzip
import io
read_file_name = 'jd_example.txt.gz'
file_mode = 'rb'
with gzip.open(read_file_name, file_mode) as input_file:
    with io.TextIOWrapper(input_file, encoding='utf-8') as dec:
        print(dec.read())
# i = io.TextIOWrapper(gzip.open(input_gz, "rb"), encoding='utf-8')
# This returns an object "i" that behaves like the one from the common "open",
# so you can use "i.readline()" or "for line in i" and so on
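As an alternative to wrapping the binary stream in io.TextIOWrapper by hand, gzip.open also accepts the text modes 'wt' and 'rt' together with an encoding. The sketch below writes to a temporary directory so it leaves no files behind; the helper name roundtrip_text and the file name are hypothetical.

```python
import gzip
import os
import tempfile

def roundtrip_text(text):
    """Compress text to a .gz file in text mode, then read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.txt.gz")

        # 'wt' opens the compressed file for writing str instead of bytes
        with gzip.open(path, "wt", encoding="utf-8") as f:
            f.write(text)

        # 'rt' decodes the decompressed bytes back into str
        with gzip.open(path, "rt", encoding="utf-8") as f:
            return f.read()

restored = roundtrip_text("We can write anything in the file here.\n")
print(restored, end="")
```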
https://machinelearningmastery.com/statistical-hypothesis-tests-in-python-cheat-sheet/
By Jason Brownlee on August 15, 2018 in Statistics
This section lists statistical tests that you can use to check if your data has a Gaussian distribution.
Tests whether a data sample has a Gaussian distribution.
Assumptions
Interpretation
Python Code
# Example of the Shapiro-Wilk Normality Test
from scipy.stats import shapiro
data = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
stat, p = shapiro(data)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably Gaussian')
else:
    print('Probably not Gaussian')
More Information
Tests whether a data sample has a Gaussian distribution.
Assumptions
Interpretation
Python Code
# Example of the D'Agostino's K^2 Normality Test
from scipy.stats import normaltest
data = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
stat, p = normaltest(data)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably Gaussian')
else:
    print('Probably not Gaussian')
More Information
Tests whether a data sample has a Gaussian distribution.
Assumptions
Interpretation
Python Code
# Example of the Anderson-Darling Normality Test
from scipy.stats import anderson
data = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
result = anderson(data)
print('stat=%.3f' % (result.statistic))
for i in range(len(result.critical_values)):
    sl, cv = result.significance_level[i], result.critical_values[i]
    if result.statistic < cv:
        print('Probably Gaussian at the %.1f%% level' % (sl))
    else:
        print('Probably not Gaussian at the %.1f%% level' % (sl))
More Information
This section lists statistical tests that you can use to check if two samples are related.
Tests whether two samples have a linear relationship.
Assumptions
Interpretation
Python Code
# Example of the Pearson's Correlation test
from scipy.stats import pearsonr
data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [0.353, 3.517, 0.125, -7.545, -0.555, -1.536, 3.350, -1.578, -3.537, -1.579]
stat, p = pearsonr(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably independent')
else:
    print('Probably dependent')
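To make the statistic itself more concrete, here is a small sketch that computes Pearson's correlation coefficient by hand using only the standard library. It is an illustration of the formula, not how SciPy implements pearsonr, and it does not compute a p-value; the function name pearson_r and the sample data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# A perfectly increasing and a perfectly decreasing linear relationship
r_linear = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_inverse = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(r_linear, r_inverse)
```

Values near 1 or -1 indicate a strong linear relationship, while values near 0 indicate little or none.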
More Information
Tests whether two samples have a monotonic relationship.
Assumptions
Interpretation
Python Code
# Example of the Spearman's Rank Correlation Test
from scipy.stats import spearmanr
data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [0.353, 3.517, 0.125, -7.545, -0.555, -1.536, 3.350, -1.578, -3.537, -1.579]
stat, p = spearmanr(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably independent')
else:
    print('Probably dependent')
More Information
Tests whether two samples have a monotonic relationship.
Assumptions
Interpretation
Python Code
# Example of the Kendall's Rank Correlation Test
from scipy.stats import kendalltau
data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [0.353, 3.517, 0.125, -7.545, -0.555, -1.536, 3.350, -1.578, -3.537, -1.579]
stat, p = kendalltau(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably independent')
else:
    print('Probably dependent')
More Information
Tests whether two categorical variables are related or independent.
Assumptions
Interpretation
Python Code
# Example of the Chi-Squared Test
from scipy.stats import chi2_contingency
table = [[10, 20, 30],[6, 9, 17]]
stat, p, dof, expected = chi2_contingency(table)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably independent')
else:
    print('Probably dependent')
More Information
This section lists statistical tests that you can use to check if a time series is stationary or not.
Tests whether a time series has a unit root, e.g. has a trend or more generally is autoregressive.
Assumptions
Interpretation
Python Code
# Example of the Augmented Dickey-Fuller unit root test
from statsmodels.tsa.stattools import adfuller
data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
stat, p, lags, obs, crit, t = adfuller(data)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably not Stationary')
else:
    print('Probably Stationary')
More Information
Tests whether a time series is trend stationary or not.
Assumptions
Interpretation
Python Code
# Example of the Kwiatkowski-Phillips-Schmidt-Shin test
from statsmodels.tsa.stattools import kpss
data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
stat, p, lags, crit = kpss(data)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably Stationary')
else:
    print('Probably not Stationary')
More Information
This section lists statistical tests that you can use to compare data samples.
Tests whether the means of two independent samples are significantly different.
Assumptions
Interpretation
Python Code
# Example of the Student's t-test
from scipy.stats import ttest_ind
data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
stat, p = ttest_ind(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
More Information
Tests whether the means of two paired samples are significantly different.
Assumptions
Interpretation
Python Code
# Example of the Paired Student's t-test
from scipy.stats import ttest_rel
data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
stat, p = ttest_rel(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
More Information
Tests whether the means of two or more independent samples are significantly different.
Assumptions
Interpretation
Python Code
# Example of the Analysis of Variance Test
from scipy.stats import f_oneway
data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
data3 = [-0.208, 0.696, 0.928, -1.148, -0.213, 0.229, 0.137, 0.269, -0.870, -1.204]
stat, p = f_oneway(data1, data2, data3)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
More Information
Tests whether the means of two or more paired samples are significantly different.
Assumptions
Interpretation
Python Code
Not available in SciPy. The statsmodels library provides a repeated measures ANOVA as statsmodels.stats.anova.AnovaRM.
More Information
Tests whether the distributions of two independent samples are equal or not.
Assumptions
Interpretation
Python Code
# Example of the Mann-Whitney U Test
from scipy.stats import mannwhitneyu
data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
stat, p = mannwhitneyu(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
More Information
Tests whether the distributions of two paired samples are equal or not.
Assumptions
Interpretation
Python Code
# Example of the Wilcoxon Signed-Rank Test
from scipy.stats import wilcoxon
data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
stat, p = wilcoxon(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
More Information
Tests whether the distributions of two or more independent samples are equal or not.
Assumptions
Interpretation
Python Code
# Example of the Kruskal-Wallis H Test
from scipy.stats import kruskal
data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
stat, p = kruskal(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
More Information
Tests whether the distributions of two or more paired samples are equal or not.
Assumptions
Interpretation
Python Code
# Example of the Friedman Test
from scipy.stats import friedmanchisquare
data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
data3 = [-0.208, 0.696, 0.928, -1.148, -0.213, 0.229, 0.137, 0.269, -0.870, -1.204]
stat, p = friedmanchisquare(data1, data2, data3)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
More Information
This section provides more resources on the topic if you are looking to go deeper.
https://blog.csdn.net/u013555719/article/details/84550700
# Delete list elements with pop inside a fixed-length index loop
num_list_1 = [1, 2, 2, 2, 3]
for i in range(len(num_list_1)):
    if num_list_1[i] == 2:
        num_list_1.pop(i)
    else:
        print(num_list_1[i])
print("num_list_1:", num_list_1)
# IndexError: list index out of range
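The reason is that after an element is removed the list becomes shorter, but the number of iterations does not decrease: the loop still runs over the original length, so the index eventually goes out of range.
Consecutive matching elements are not all removed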
# Delete list elements while iterating forwards
num_list_2 = [1, 2, 2, 2, 3]
for item in num_list_2:
    if item == 2:
        num_list_2.remove(item)
    else:
        print("item", item)
    print("num_list_2", num_list_2)
print("after remove op", num_list_2)
# item 1
# num_list_2 [1, 2, 2, 2, 3]
# num_list_2 [1, 2, 2, 3]
# num_list_2 [1, 2, 3]
# after remove op [1, 2, 3]
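When the condition matches and an element 2 is removed, all later elements shift forward by one, but the loop's internal index does not shift with them; it simply advances from its previous position. The element that moved into the vacated slot is therefore skipped, and some matches are missed.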
# Delete list elements while iterating in reverse
num_list_3 = [1, 2, 2, 2, 3]
for item in num_list_3[::-1]:
    if item == 2:
        num_list_3.remove(item)
    else:
        print("item", item)
    print("num_list_3", num_list_3)
print("after remove op", num_list_3)
# item 3
# num_list_3 [1, 2, 2, 2, 3]
# num_list_3 [1, 2, 2, 3]
# num_list_3 [1, 2, 3]
# num_list_3 [1, 3]
# item 1
# num_list_3 [1, 3]
# after remove op [1, 3]
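If the original list is num_list, then num_list[:] is a copy of it, that is, a new list. Iterating over the copy while removing elements from the original neither causes an index error nor skips elements, so the final result is the one we want. The drawback is that copying a very large list can use a lot of memory. In that case, reverse traversal can be used instead; note, however, that the reversed slice num_list_3[::-1] used above is itself a copy, so to really avoid copying, iterate over the indices from the end of the list.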
# Iterate over a copy of the list, modify the original list
num_list_4 = [1, 2, 2, 2, 3]
for item in num_list_4[:]:
    if item == 2:
        num_list_4.remove(item)
    else:
        print("item", item)
    print("num_list_4", num_list_4)
print("after remove op", num_list_4)
# item 1
# num_list_4 [1, 2, 2, 2, 3]
# num_list_4 [1, 2, 2, 3]
# num_list_4 [1, 2, 3]
# num_list_4 [1, 3]
# item 3
# num_list_4 [1, 3]
# after remove op [1, 3]
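Two further options are worth sketching; neither is from the referenced article. The first builds a new filtered list with a list comprehension, which is usually the simplest approach. The second deletes in place by walking the indices backwards, which avoids both the skipped elements and the extra copy. Variable names are only for illustration.

```python
num_list = [1, 2, 2, 2, 3]

# Option 1: build a new list that keeps only the wanted elements
filtered = [item for item in num_list if item != 2]

# Option 2: delete in place, visiting indices from the end so that
# removing an element never shifts the positions still to be visited
in_place = [1, 2, 2, 2, 3]
for i in range(len(in_place) - 1, -1, -1):
    if in_place[i] == 2:
        del in_place[i]

print(filtered, in_place)
```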
pypinyin
https://zhuanlan.zhihu.com/p/374674547?utm_id=0
pip install pypinyin
from pypinyin import pinyin, lazy_pinyin, Style
pinyin('中心')
# [['zhōng'], ['xīn']]
pinyin('中心', heteronym=True) # enable heteronym (multiple readings) mode
# [['zhōng', 'zhòng'], ['xīn']]
pinyin('中心', style=Style.FIRST_LETTER) # set the pinyin style
# [['z'], ['x']]
# TONE2 puts the tone number right after the toned letter
pinyin('中心', style=Style.TONE2, heteronym=True)
# [['zho1ng', 'zho4ng'], ['xi1n']]
# TONE3 puts the tone number at the end of each syllable
pinyin('中心', style=Style.TONE3, heteronym=True)
# [['zhong1', 'zhong4'], ['xin1']]
lazy_pinyin('中心') # ignore heteronyms
# ['zhong', 'xin']
lazy_pinyin('战略', v_to_u=True) # use ü instead of v
# ['zhan', 'lüe']
# Use 5 to mark the neutral tone
lazy_pinyin('衣裳', style=Style.TONE3, neutral_tone_with_five=True)
# ['yi1', 'shang5']
Convert text to pinyin from the command line:
python -m pypinyin 音乐
# yīn yuè
See the reference link for details.