Python Study Notes: Data Processing, 2019-05-02 – AJohn11 – Python Quantitative Investing

The SciPy ecosystem

Scientific computing in Python builds upon a small core of packages:

  • Python, a general-purpose programming language. It is interpreted and dynamically typed, which makes it well suited to interactive work and quick prototyping, while being powerful enough for writing large applications.
  • NumPy, the fundamental package for numerical computation. It defines the numerical array and matrix types and basic operations on them.
  • The SciPy library, a collection of numerical algorithms and domain-specific toolboxes, including signal processing, optimization, statistics and much more.
  • Matplotlib, a mature and popular plotting package that provides publication-quality 2D plotting as well as rudimentary 3D plotting.
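
As a rough illustration of how the core packages fit together, the sketch below builds a NumPy array, applies a vectorized operation to it, and then hands a function to a SciPy optimizer. The function being minimized is made up for this example and is not taken from the original notes.

```python
import numpy as np
from scipy import optimize

# NumPy: the array type and elementwise (vectorized) operations.
x = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced points in [0, 1]
y = x ** 2                     # elementwise square, no explicit Python loop

# SciPy: numerical algorithms that build on NumPy.
# Illustrative objective f(t) = (t - 2)^2, whose minimum lies at t = 2.
result = optimize.minimize_scalar(lambda t: (t - 2.0) ** 2)

print(y)
print(result.x)  # location of the minimum found by the optimizer
```

The resulting arrays could then be passed to Matplotlib (for example `matplotlib.pyplot.plot(x, y)`) to draw the curve.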

On this base, the SciPy ecosystem includes general and specialised tools for data management and computation, productive experimentation and high-performance computing. Below is an overview of some key packages; many more relevant packages exist.

Data and computation:

  • pandas, providing high-performance, easy-to-use data structures.
  • SymPy, for symbolic mathematics and computer algebra.
  • scikit-image is a collection of algorithms for image processing.
  • scikit-learn is a collection of algorithms and tools for machine learning.
  • h5py and PyTables can both access data stored in the HDF5 format.
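
To give a feel for the pandas data structures mentioned above, here is a minimal sketch that groups a small table and averages one column. The symbols and prices are invented purely for illustration.

```python
import pandas as pd

# Hypothetical price data, made up for this example only.
prices = pd.DataFrame(
    {
        "symbol": ["A", "A", "B", "B"],
        "close": [10.0, 11.0, 20.0, 19.0],
    }
)

# Mean closing price per symbol; the result is a Series indexed by symbol.
mean_close = prices.groupby("symbol")["close"].mean()

print(mean_close)
```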

Productivity and high-performance computing:

  • IPython, a rich interactive interface, letting you quickly process data and test ideas.
  • The Jupyter notebook provides IPython functionality and more in your web browser, allowing you to document your computation in an easily reproducible form.
  • Cython extends Python syntax so that you can conveniently build C extensions, either to speed up critical code, or to integrate with C/C++ libraries.
  • Dask, Joblib or IPyParallel for distributed processing with a focus on numeric data.

Quality assurance:

  • nose, a framework for testing Python code, now being phased out in favor of pytest.
  • numpydoc, a standard and library for documenting Scientific Python libraries.
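
The two quality-assurance tools can be pictured together in a small sketch: a function documented with numpydoc-style sections, plus a test written in the plain-assert style that pytest collects. The function `moving_average` is a hypothetical helper invented for this example.

```python
def moving_average(values, window):
    """Compute the simple moving average of a sequence.

    Parameters
    ----------
    values : sequence of float
        Input data.
    window : int
        Number of consecutive elements to average; must be positive.

    Returns
    -------
    list of float
        One average per full window, so the result has
        ``len(values) - window + 1`` elements (empty if the input is
        shorter than the window).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def test_moving_average():
    # pytest collects functions named test_* and reports a failed
    # assert as a test failure.
    assert moving_average([1.0, 2.0, 3.0, 4.0], 2) == [1.5, 2.5, 3.5]
```

Saved in a file whose name starts with `test_`, this would typically be run with the `pytest` command from the same directory.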

© Copyright belongs to the author. For reprints or content collaboration, please contact the author.
https://www.jianshu.com/p/0d2e0d029c89
