Scientific Python¶

Author : Juhyung Kang (Seoul National University)¶

Email : jhkang@astro.snu.ac.kr¶

In this section, we learn some basic scientific functions and packages. We use the NumPy, SciPy, Matplotlib, Interpolation, AstroPy and SunPy packages.

Required packages

Python v2.7 or 3.5 or 3.6

NumPy v1.7+

SciPy v1.0+

Matplotlib v2.0+

AstroPy v2.0+

SunPy v0.8+ (Warning ! : This package is developing package, so that some of functions can be depricated)

Interpolation v0.18+

Above requried packages can be downloaded by using conda command in your cmd prompt or terminal as below :

conda install -c condaforge sunpy astropy interpolation

This tutorial is based on the Scipy-Lecture notes.

You can download this tutorial file on the Scientific-python FISS website below

http://fiss.snu.ac.kr/tutorial/scientific_python ¶

and some examples files.

1. Compatibility¶

Some of calling syntaxes are differen according to python version. The main differences are print statement and division operator. In python 2, division operator '/' is the floor division, but in python3 it isn't the floor division. In python3 floor division syntax is '//' To avoid this compatibility problem, we import the "future" method.

In [1]:

from __future__ import division, print_function

2. Array (NumPy package)¶

To call arrays in python, we have to import the numpy package at first.

In [2]:

import numpy as np

2.1. Manual construction of arrays¶

In [3]:

a = np.array([1,5,2,4,3,0])
a

Out[3]:

array([1, 5, 2, 4, 3, 0])

This array can be sort by using sort function

In [4]:

a.sort()
a

Out[4]:

array([0, 1, 2, 3, 4, 5])

In [5]:

a.dtype

Out[5]:

dtype('int32')

In [6]:

b = np.array([[0,1,2],[3,4,5]],dtype = 'float')
b

Out[6]:

array([[ 0.,  1.,  2.],
       [ 3.,  4.,  5.]])

In [7]:

b.dtype

Out[7]:

dtype('float64')

In [8]:

b.shape          # Same with the size function in IDL

Out[8]:

(2, 3)

In [9]:

len(b)           # returns the size of the first dimension

Out[9]:

convert row array to column array

In [10]:

c = a[:,None]
c

Out[10]:

array([[0],
       [1],
       [2],
       [3],
       [4],
       [5]])

2.2 Functions for creating arrays¶

In [11]:

a = np.arange(6)   # similar with the *indgen functions in IDL
a

Out[11]:

array([0, 1, 2, 3, 4, 5])

In [12]:

b = np.arange(0,10,0.1)
b

Out[12]:

array([ 0. ,  0.1,  0.2,  0.3,  0.4,  0.5,  0.6,  0.7,  0.8,  0.9,  1. ,
        1.1,  1.2,  1.3,  1.4,  1.5,  1.6,  1.7,  1.8,  1.9,  2. ,  2.1,
        2.2,  2.3,  2.4,  2.5,  2.6,  2.7,  2.8,  2.9,  3. ,  3.1,  3.2,
        3.3,  3.4,  3.5,  3.6,  3.7,  3.8,  3.9,  4. ,  4.1,  4.2,  4.3,
        4.4,  4.5,  4.6,  4.7,  4.8,  4.9,  5. ,  5.1,  5.2,  5.3,  5.4,
        5.5,  5.6,  5.7,  5.8,  5.9,  6. ,  6.1,  6.2,  6.3,  6.4,  6.5,
        6.6,  6.7,  6.8,  6.9,  7. ,  7.1,  7.2,  7.3,  7.4,  7.5,  7.6,
        7.7,  7.8,  7.9,  8. ,  8.1,  8.2,  8.3,  8.4,  8.5,  8.6,  8.7,
        8.8,  8.9,  9. ,  9.1,  9.2,  9.3,  9.4,  9.5,  9.6,  9.7,  9.8,
        9.9])

In [13]:

np.arange?

In [14]:

c = np.linspace(0,10,101)
c

Out[14]:

array([  0. ,   0.1,   0.2,   0.3,   0.4,   0.5,   0.6,   0.7,   0.8,
         0.9,   1. ,   1.1,   1.2,   1.3,   1.4,   1.5,   1.6,   1.7,
         1.8,   1.9,   2. ,   2.1,   2.2,   2.3,   2.4,   2.5,   2.6,
         2.7,   2.8,   2.9,   3. ,   3.1,   3.2,   3.3,   3.4,   3.5,
         3.6,   3.7,   3.8,   3.9,   4. ,   4.1,   4.2,   4.3,   4.4,
         4.5,   4.6,   4.7,   4.8,   4.9,   5. ,   5.1,   5.2,   5.3,
         5.4,   5.5,   5.6,   5.7,   5.8,   5.9,   6. ,   6.1,   6.2,
         6.3,   6.4,   6.5,   6.6,   6.7,   6.8,   6.9,   7. ,   7.1,
         7.2,   7.3,   7.4,   7.5,   7.6,   7.7,   7.8,   7.9,   8. ,
         8.1,   8.2,   8.3,   8.4,   8.5,   8.6,   8.7,   8.8,   8.9,
         9. ,   9.1,   9.2,   9.3,   9.4,   9.5,   9.6,   9.7,   9.8,
         9.9,  10. ])

In [15]:

np.linspace?

In [16]:

one = np.ones((3,3))

In [17]:

one

Out[17]:

array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

In [18]:

zero = np.zeros((2,3))

In [19]:

zero

Out[19]:

array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])

In [20]:

np.random.seed(2018)

In [21]:

randu = np.random.rand(4)   #uniform random number generator
randu

Out[21]:

array([ 0.88234931,  0.10432774,  0.90700933,  0.3063989 ])

In [22]:

randg = np.random.randn(5) #gaussian random number generator
randg

Out[22]:

array([ 2.14839926, -1.279487  ,  0.50227689,  0.8560293 , -0.14279008])

Exercise 1) Create an array which has equaily spaced elements from 5 to 100 with step size = 5 .¶

Exercise 2) Create an array which has 100 equaily spaced elements from 0 to 2¶

2.3. Indexing and slicing¶

In [23]:

a = np.arange(10)

In [24]:

a[0]

Out[24]:

In [25]:

a[:]

Out[25]:

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [26]:

a[3:6]

Out[26]:

array([3, 4, 5])

In [27]:

a[3:8:2]

Out[27]:

array([3, 5, 7])

In [28]:

a[3:]

Out[28]:

array([3, 4, 5, 6, 7, 8, 9])

Array reshaping¶

In [29]:

d=np.arange(25)

In [30]:

d = d.reshape((5,5))

In [31]:

Out[31]:

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

In [32]:

d[3]

Out[32]:

array([15, 16, 17, 18, 19])

In [33]:

d[:,3]

Out[33]:

array([ 3,  8, 13, 18, 23])

In [34]:

b = d.transpose()

In [35]:

b[3]

Out[35]:

array([ 3,  8, 13, 18, 23])

In [36]:

d[:2]

Out[36]:

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

2.4. Broadcasting¶

In [37]:

a = np.arange(5)

In [38]:

b = np.ones(5, dtype=int)

In [39]:

Out[39]:

array([0, 1, 2, 3, 4])

In [40]:

Out[40]:

array([1, 1, 1, 1, 1])

In [41]:

c = a+b[:,None]

In [42]:

Out[42]:

array([[1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5]])

In [43]:

c + a

Out[43]:

array([[1, 3, 5, 7, 9],
       [1, 3, 5, 7, 9],
       [1, 3, 5, 7, 9],
       [1, 3, 5, 7, 9],
       [1, 3, 5, 7, 9]])

In [44]:

c + a[:,None]

Out[44]:

array([[1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6],
       [3, 4, 5, 6, 7],
       [4, 5, 6, 7, 8],
       [5, 6, 7, 8, 9]])

Exercise 3) Create the $10 \times 10$ multiplication table. (Do not use 'for' loop.)

Multiplication

2.5. Boolean operation¶

In [45]:

Out[45]:

array([[1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5]])

In [46]:

c == 1

Out[46]:

array([[ True, False, False, False, False],
       [ True, False, False, False, False],
       [ True, False, False, False, False],
       [ True, False, False, False, False],
       [ True, False, False, False, False]], dtype=bool)

In [47]:

Out[47]:

array([1, 1, 1, 1, 1])

In [48]:

mask = c <= b

In [49]:

mask

Out[49]:

array([[ True, False, False, False, False],
       [ True, False, False, False, False],
       [ True, False, False, False, False],
       [ True, False, False, False, False],
       [ True, False, False, False, False]], dtype=bool)

In [50]:

Out[50]:

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

In [51]:

d[mask]

Out[51]:

array([ 0,  5, 10, 15, 20])

In [52]:

np.where(mask)

Out[52]:

(array([0, 1, 2, 3, 4], dtype=int64), array([0, 0, 0, 0, 0], dtype=int64))

In [53]:

True or False

Out[53]:

True

In [54]:

True and False

Out[54]:

False

In [55]:

mask1 = np.array([True,False,False,True,False])
mask2 = np.array([True,True,False,True,False])

In [56]:

mask1 or mask

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-56-df0b0dad315b> in <module>()
----> 1 mask1 or mask

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In [57]:

mask1 and mask2

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-57-d93ac612a2e9> in <module>()
----> 1 mask1 and mask2

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In [58]:

np.logical_and(mask1, mask2)

Out[58]:

array([ True, False, False,  True, False], dtype=bool)

In [59]:

np.logical_or(mask1, mask2)

Out[59]:

array([ True,  True, False,  True, False], dtype=bool)

2.6. Basic reduction¶

Basic Reduction

In [60]:

d[0,3]=-1
d[2,2]=-10

In [61]:

Out[61]:

array([[  0,   1,   2,  -1,   4],
       [  5,   6,   7,   8,   9],
       [ 10,  11, -10,  13,  14],
       [ 15,  16,  17,  18,  19],
       [ 20,  21,  22,  23,  24]])

In [62]:

d.sum()

Out[62]:

In [63]:

d.sum(0)

Out[63]:

array([50, 55, 38, 61, 70])

In [64]:

d.mean(1)

Out[64]:

array([  1.2,   7. ,   7.6,  17. ,  22. ])

In [65]:

d.min(1)

Out[65]:

array([ -1,   5, -10,  15,  20])

In [66]:

d.argmin(1)

Out[66]:

array([3, 0, 2, 0, 0], dtype=int64)

Exercise 4) Create the 10x10 gaussian random number, and find the minimum value along the axis 0

3. Plotting (Matplotlib package)¶

In this section, we are going to learn a simple way to visualize data in interactive mode by using the matplotlib package. matplotlib package is based on the Matlab plot device.

First we change the default backend mode to interactive mode in the IPython console.

In [110]:

%matplotlib

Using matplotlib backend: Qt5Agg

If you use the Jupyter notebook like this manual, type the magic funcition like below:

In [67]:

%matplotlib notebook

or :

In [67]:

%matplotlib inline

3.1. Simple plot¶

Then import the pyplot module in matplotlib package.

In [68]:

import matplotlib.pyplot as plt

Now we draw the cosine and sine functiuons on the same plot from $0$ to $2\pi$.

In [69]:

theta = np.linspace(-np.pi, np.pi,151)
c = np.cos(theta)
s = np.sin(theta)

In [70]:

plt.plot(theta, c, lw = 2.5, label= r'cos($\theta$)')
plt.plot(theta, s, label= r'sin($\theta$)')

Out[70]:

[<matplotlib.lines.Line2D at 0xf5bb4ef8d0>]

In [71]:

plt.xlim(-np.pi,np.pi)    # set the x limit. Same with the xrange keyward in IDL plot processor

Out[71]:

(-3.141592653589793, 3.141592653589793)

In [72]:

plt.ylim(-1,1)

Out[72]:

(-1, 1)

In [74]:

plt.legend(loc=1)

Out[74]:

<matplotlib.legend.Legend at 0xf5bbc774e0>

In [75]:

plt.tight_layout()

In [76]:

plt.savefig(r'C:\Tutorial\tri_fun.png')

In [77]:

plt.savefig(r'C:\Tutorial\tri_fun.eps')

Let's check the plot keywards.

In [78]:

plt.plot?

Exercise 5) Plot the tan function in red dashed line in $\theta$ range from $-\frac{2\pi}{5}$ to $\frac{2\pi}{5}$, and insert the legendary.

3.2. Subplots¶

Before we start this section, we change the plot setting. First we change the tick direction.

In [76]:

plt.rc('xtick', direction='in')
plt.rc('ytick', direction='in')
plt.rc('xtick.minor',visible=True)
plt.rc('ytick.minor',visible=True)

In [77]:

fig, ax = plt.subplots(2, 2, figsize=(9,7))

3.2.1. Moving spines & Scatter plot¶

In [78]:

x = np.linspace(-np.pi, np.pi, 15)
y = np.sin(x)

In [79]:

p1=ax[0,0].plot(theta, s, 'k')

In [80]:

ax[0,0].scatter(x, y, marker='o', c='b', edgecolors= 'b')

Out[80]:

<matplotlib.collections.PathCollection at 0xf5bcca00b8>

In [81]:

ax[0,0].set_xlim(-np.pi,np.pi)
ax[0,0].set_ylim(-1.1,1.1)

Out[81]:

(-1.1, 1.1)

In [82]:

ax[0,0].set_title('Moving spines')

Out[82]:

Text(0.5,1,'Moving spines')

In [83]:

for i in ['top', 'right'] :
    ax[0,0].spines[i].set_color('none')

In [84]:

ax[0,0].xaxis.set_ticks_position('bottom')
ax[0,0].yaxis.set_ticks_position('left')
for i in ['left', 'bottom'] :
    ax[0,0].spines[i].set_position(('data',0))

In [85]:

p1.pop(0).remove()

3.2.2. Error plot¶

In [86]:

yerr= np.sqrt(np.abs(y-0.5))+0.1

In [87]:

ax[0,1].errorbar(x, y, xerr=0.15, yerr=yerr,fmt='-o', capthick=1, capsize=5, color = 'k')

Out[87]:

<Container object of 3 artists>

In [88]:

ax[0,1].set_title('Error plot')

Out[88]:

Text(0.5,1,'Error plot')

3.2.3. Plot image¶

In [89]:

gx= np.linspace(-5,5,500)
gy= np.linspace(-4,4,400)
gauss2d = np.exp(-0.5*(gx/gx.std())**2-0.5*(gy[:,None]/gy.std())**2)

In [90]:

gauss2d.shape

Out[90]:

(400, 500)

In [91]:

im= ax[1,0].imshow(gauss2d)

In [92]:

im.remove()

In [93]:

im= ax[1,0].imshow(gauss2d, origin='lower', cmap=plt.cm.rainbow)

In [94]:

im.remove()

In [95]:

im= ax[1,0].imshow(gauss2d, origin='lower', cmap=plt.cm.rainbow, extent= [-5, 5, -2, 2])

In [96]:

cbar= plt.colorbar(im, ax=ax[1,0], pad=0)

In [97]:

cbar.set_label('Rainbow')

In [98]:

ax[1,0].set_title('Image plot')

Out[98]:

Text(0.5,1,'Image plot')

3.2.4. Contour¶

In [99]:

ax[1,1].contourf(gx, gy, gauss2d, 5, cmap=plt.cm.rainbow)

Out[99]:

<matplotlib.contour.QuadContourSet at 0xf5be7a6e48>

In [100]:

c= ax[1,1].contour(gx, gy, gauss2d, 5, colors='k')

In [101]:

ax[1,1].clabel(c)

Out[101]:

<a list of 7 text.Text objects>

In [102]:

ax[1,1].set_title('Contour')

Out[102]:

Text(0.5,1,'Contour')

In [103]:

fig.tight_layout(h_pad=0,w_pad=0)

3.3. Inset figure¶

In some cases, we may draw the zoomed a portion of image on the original image like below. zoom-in In order to insert inset figure on the original image, we first import the mpl_toolkits.

In [104]:

from mpl_toolkits.axes_grid1.inset_locator import zoomed_inset_axes, mark_inset

Then we download the sample image.

In [105]:

import os

In [106]:

dirn = r'C:\Tutorial\sample'
filen = 'goblin.png'

In [107]:

file=os.path.join(dirn,filen)

In [108]:

file

Out[108]:

'C:\\Tutorial\\sample\\goblin.png'

In [109]:

goblin=plt.imread(file)

In [110]:

goblin.shape

Out[110]:

(138, 141, 3)

In [111]:

fig = plt.figure(figsize=(6,6))

In [112]:

ax = plt.subplot(111)

In [113]:

im = ax.imshow(goblin, origin='lower')

In [114]:

xlim = (57, 78)
ylim = (32, 53)

In [115]:

axins = zoomed_inset_axes(ax, 2, loc = 4)

In [116]:

axins.imshow(goblin, origin='lower')

Out[116]:

<matplotlib.image.AxesImage at 0xf5bb6fa5c0>

In [117]:

axins.set_xlim(xlim)
axins.set_ylim(ylim)

Out[117]:

(32, 53)

In [118]:

axins.tick_params(left='off', right='off', top='off', bottom='off', labelleft='off', labelbottom='off')

In [119]:

mark_inset(ax, axins, loc1=1, loc2=3)

Out[119]:

(<mpl_toolkits.axes_grid1.inset_locator.BboxPatch at 0xf5bed28f28>,
 <mpl_toolkits.axes_grid1.inset_locator.BboxConnector at 0xf5bb6da898>,
 <mpl_toolkits.axes_grid1.inset_locator.BboxConnector at 0xf5be6ad828>)

$+ \alpha)$ Annotation¶

In [120]:

ax.annotate('Cute-eye', xy=(53, 70), xytext=(9,125), arrowprops=dict(fc='r', shrink=0.03), fontweight='bold')

Out[120]:

Text(9,125,'Cute-eye')

$\star$ Well-fitted colorbar¶

In [121]:

from mpl_toolkits.axes_grid1 import make_axes_locatable

In [122]:

ax = plt.subplot(111)

In [123]:

im = ax.imshow(gauss2d, origin='lower', cmap= plt.cm.rainbow)

In [124]:

divider = make_axes_locatable(ax)

In [125]:

cax = divider.append_axes('right', size='5%', pad=0)

In [126]:

cbar = plt.colorbar(im, cax=cax)

In [127]:

cbar.set_label('Colorbar', fontweight='bold', fontsize=12)

4. Scientific Tools (SciPy Package)¶

In this section, we learn the useful function to analyze a data.

4.1. FFT¶

"A fast Fourier transform (FFT) is an algorithm that samples a signal over a period of time (or space) and divides it into its frequency components." (Wikipedia)

Basic fft functions are contained in scipy.fftpack module.

In [128]:

from scipy import fftpack

In [129]:

fftpack.fft

Out[129]:

<function scipy.fftpack.basic.fft>

We download the example file, and apply fft to the data

In [130]:

example = r'C:/Tutorial/sample/sst_nino3.dat'

To read this we use the numpy.loadtxt() function.

In [131]:

data = np.loadtxt(example)

In [132]:

len(data)

Out[132]:

In [133]:

plt.plot(data)

Out[133]:

[<matplotlib.lines.Line2D at 0xf5bfa20828>]

In [135]:

fdata = fftpack.fft(data)

In [136]:

plt.plot(fdata)

C:\ProgramData\Anaconda3\lib\site-packages\numpy\core\numeric.py:531: ComplexWarning: Casting complex values to real discards the imaginary part
  return array(a, dtype, copy=False, order=order)

Out[136]:

[<matplotlib.lines.Line2D at 0xf5bfaaf208>]

In [137]:

repro = fftpack.ifft(fdata)

In [139]:

plt.plot(repro)

C:\ProgramData\Anaconda3\lib\site-packages\numpy\core\numeric.py:531: ComplexWarning: Casting complex values to real discards the imaginary part
  return array(a, dtype, copy=False, order=order)

Out[139]:

[<matplotlib.lines.Line2D at 0xf5c075ff98>]

In [140]:

fdata[50:452]=0

In [141]:

repro2 = fftpack.ifft(fdata)

In [142]:

plt.plot(repro2)

C:\ProgramData\Anaconda3\lib\site-packages\numpy\core\numeric.py:531: ComplexWarning: Casting complex values to real discards the imaginary part
  return array(a, dtype, copy=False, order=order)

Out[142]:

[<matplotlib.lines.Line2D at 0xf5c0c02b00>]

4.2. Convolution¶

We apply the gaussian convolution kernal to goblin image.

In this section, we use the fftconvolve function since it is faster than convolve function.

In [143]:

from scipy.signal import fftconvolve

In [144]:

kern = np.array([[1,2,1],[2,4,2],[1,2,1]])/16

In [145]:

convimg = [fftconvolve(goblin[:,:,i], kern, mode='same') for i in range(3)]

In [146]:

convimg = np.array(convimg)

In [147]:

convimg.shape

Out[147]:

(3, 138, 141)

In [148]:

convimg=convimg.transpose((1,2,0))

In [149]:

convimg.shape

Out[149]:

(138, 141, 3)

In [150]:

fig, ax = plt.subplots(1,2,figsize=(8,5))

In [151]:

ax[0].imshow(goblin)

Out[151]:

<matplotlib.image.AxesImage at 0xf5c2028ac8>

In [152]:

ax[1].imshow(convimg)

Out[152]:

<matplotlib.image.AxesImage at 0xf5c27b76d8>

4.3. Fitting¶

4.3.1. Polynomial fittng¶

Let's fit polynomial to data. Here, we use the nummpy polynomial() fitting function.

In [153]:

x = np.arange(len(data))
y = data.copy()

In [154]:

pfcoeff = np.polyfit(x, y, 1)

The polyfit function gives the fitting coefficients for given data.

In [155]:

pfcoeff

Out[155]:

array([ 0.00052717, -0.1326027 ])

To create the polynomial fitting curve, we use the numpy.polyval() function.

In [156]:

pfval = np.polyval(pfcoeff, x)

Then, let's plot the original data and fitting curve

In [157]:

plt.plot(x, y, 'k', label='Real Data')

Out[157]:

[<matplotlib.lines.Line2D at 0xf5c28fd3c8>]

In [158]:

plt.plot(x, pfval, 'r', label='Fitting Curve')

Out[158]:

[<matplotlib.lines.Line2D at 0xf5c290ec88>]

In [159]:

plt.legend()

Out[159]:

<matplotlib.legend.Legend at 0xf5c27aba90>

4.3.2. Model fitting¶

In, this sub-section, we fit sinosidal function to the data.

At first, we make a sinosidal function.

In [160]:

def func(theta, p0, p1, p2, p3):
    return p0*np.sin(p1*theta+p2)+p3

Then import the curve_fit() function.

In [161]:

from scipy.optimize import curve_fit

Now we guass an initial value for fitting function

In [162]:

init = [2, 0.5, 0.1, -1]

In [163]:

xs = x[100:150]
ys = y[100:150]

Let's fit the given function to data.

In [164]:

coeff, cov = curve_fit(func, xs, ys, p0=init)

In [165]:

print(coeff)

[ 0.77459613  0.54645094 -4.62010486  0.22718251]

In [168]:

yfit = func(xs, *coeff)

In [169]:

plt.plot(xs, ys, 'k', label='Real Data')

Out[169]:

[<matplotlib.lines.Line2D at 0xf5c07697b8>]

In [170]:

plt.plot(xs, yfit, 'r', label='Fitting Curve')

Out[170]:

[<matplotlib.lines.Line2D at 0xf5c2dd9dd8>]

4.4. Interpolation¶

In general, many python user use the scipy interpolation module for interpolate data. However, this interpolation is somewhat slower than IDL's one, since the for loop in python is very slower than any other programming language. If there are many data which have to be applyed the interpolation, that take some time. In order to solve this slow speeding interpolation, there is interpolation package which is wriiten by using the Numba package.

Numba package make python code be Just-In-Time copiled to native machine instructions, similar in performance to C, C++ and Fortran.

At first we download the interpolation package. Excute the below command line on your terimal or cmd prompt:

conda install -c conda-forge interpolation

4.4.1. Scipy interpolation (1D case)¶

Before we use this interpolation package, we learn the general interpolation function which is included in scipy package.

In [171]:

from scipy.interpolate import interp1d

In [172]:

interpf = interp1d(xs, ys, kind='cubic')

In [173]:

xin = np.linspace(xs.min(), xs.max(), 3*len(xs))

In [174]:

yin = interpf(xin)

In [175]:

plt.plot(xs, ys, 'k', label= 'original')

Out[175]:

[<matplotlib.lines.Line2D at 0xf5c2e1f240>]

In [176]:

plt.plot(xin, yin, 'r', label ='interpolated')

Out[176]:

[<matplotlib.lines.Line2D at 0xf5c4b42d30>]

In [177]:

plt.legend()

Out[177]:

<matplotlib.legend.Legend at 0xf5c4b64da0>

4.4.2. Fast Interpolation ($\geq$ 2D case)¶

This package tool will be explained in YSSS tutorial.

In [178]:

from interpolation.splines import LinearSpline

First, define the grid boundaries

In [179]:

ny, nx, nc =goblin.shape

In [180]:

lower = np.array([0,0,0])
upper = np.array([ny-1,nx-1,nc-1])
order = np.array(goblin.shape)

In [181]:

interp = LinearSpline(lower, upper, order, goblin)

In [182]:

ya = np.linspace(0,ny-1,ny*2)
xa = np.linspace(0,nx-1,nx*2)
ca = np.arange(0,nc)
zero = np.zeros((2*ny,2*nx,nc))

In [183]:

yy = ya[:,None,None]+zero
xx = xa[:,None]+zero
cc = ca+zero

In [184]:

size = yy.size

In [185]:

size

Out[185]:

In [186]:

input_a = np.array([yy.reshape(size), xx.reshape(size), cc.reshape(size)])

In [187]:

out = interp(input_a.T)
res = out.reshape((2*ny,2*nx,nc))

In [188]:

plt.imshow(res)

Out[188]:

<matplotlib.image.AxesImage at 0xf5c6f7bdd8>

4.5. Optimization (Minimization)¶

In this section we lean how to find maximum or minimum of an objective function numerically.

We use one of minimization method, Amoeba method(= Nelder-Mead method = downhill simplex method)

In [189]:

from scipy.optimize import minimize

Let's derive the minimization solution of below function.

$f(x,y) = 100(y-x^2)^2+(1-x)^2$

(1) Create the above function.

In [190]:

def minifunc(X) :
    x, y = X
    return 100*(y-x**2)**2+(1-x)**2

(2) Choose initial value.

In [191]:

X0 = [-0.5, -0.5]

In [192]:

res = minimize(minifunc, X0, method='Nelder-Mead')

In [193]:

res

Out[193]:

 final_simplex: (array([[ 1.00002709,  1.000054  ],
       [ 1.00000482,  1.00000328],
       [ 1.0000603 ,  1.00011627]]), array([  7.36938378e-10,   4.07353785e-09,   5.51563470e-09]))
           fun: 7.3693837805430572e-10
       message: 'Optimization terminated successfully.'
          nfev: 139
           nit: 75
        status: 0
       success: True
             x: array([ 1.00002709,  1.000054  ])

In [194]:

res['x']

Out[194]:

array([ 1.00002709,  1.000054  ])

Exercies 6) Find the minimized value for below function by using Nelder-Mead method starting from (x,y)=(0,0).

$f(x,y) = 2x^2+y^2-2xy+\left|x-3\right|+ \left|y-2\right|$

4.6. Noise suppresion (Savitzky-Golay Filter)¶

In general, digital data have noise. In some case noise can highly affect the real signal. To supprese this, many researcher develope smooting methods. Among those smooting methods, Savitzky-Golay Filter is most famous way to reduce noise. Savitzky-Golay filter increases the signal-to-noise ratio without greatly distorting the signal. This is achieved, in a process known as convolution, by fitting successive sub-sets of adjacent data points with a low-degree polynomial by the method of linear least squares. (Wikipedia) Sav-Gol_Filter

We use also sst_nino data in this section.

In [195]:

from scipy.signal import savgol_filter

In [196]:

sgfy = savgol_filter(y, 9, 3)

In [197]:

plt.plot(x,y,'k')

Out[197]:

[<matplotlib.lines.Line2D at 0xf5c73f49e8>]

In [198]:

plt.plot(x, sgfy, 'r')

Out[198]:

[<matplotlib.lines.Line2D at 0xf5c7394c18>]

Exercise 7) Apply the Savitzky-golay filter to sst_nino data with window length = 9 and polynomial order = 5. After that, overplot the result on the above figure. Finally, Fourier trnasform the result. Is there any diffrence with the previous result?

4.7. Read IDL sav data¶

Download the sample sav file.

In [199]:

from scipy.io import readsav

In [200]:

filen = r'C:/Tutorial/sample/sample.sav'

In [201]:

sdata = readsav(filen)

In [202]:

sdata.keys()

Out[202]:

dict_keys(['t', 'timeseries'])

In [203]:

t = sdata['t']

In [204]:

t.shape

Out[204]:

(200,)

In [207]:

series = sdata['timeseries']

In [208]:

series.shape

Out[208]:

(200, 3, 6)

In [209]:

plt.plot(t, series[:,:,1])

Out[209]:

[<matplotlib.lines.Line2D at 0xf5c73b4f98>,
 <matplotlib.lines.Line2D at 0xf5c80ec940>,
 <matplotlib.lines.Line2D at 0xf5c80eca90>]

5. Fits file handling (AstroPy Package)¶

In order to read the fits file we need to AstroPy package.

In this section we only focous on the way to read fits file. First, we dowload example the fits file.

In [210]:

filen = r'C:/Tutorial/sample/502nmos.fits'

In [211]:

from astropy.io import fits

In [212]:

data = fits.getdata(filen)
header = fits.getheader(filen)

In [213]:

header['naxis1']

Out[213]:

In [214]:

data.shape

Out[214]:

(1600, 1600)

In [215]:

plt.imshow(data,origin='lower',cmap=plt.cm.gray)

Out[215]:

<matplotlib.image.AxesImage at 0xf5c8a03390>

In [216]:

plt.clim(0,30)

In [217]:

plt.draw()

5.2. Astronomical units and constants¶

In AstroPy package, there are many useful physical constants and units.

In [218]:

import astropy.constants as c

In [219]:

kb=c.k_B   # Boltzmann constant

In [220]:

import astropy.units as u

In [221]:

T=1e3*u.K

In [222]:

E=kb*T

In [223]:

E.cgs

Out[223]:

$1.3806485 \times 10^{-13} \; \mathrm{erg}$

6. Solar software for python (SunPy)¶

Sunpy is the python version Solar Software.

It is free and open source package written in Python.

However, this package is not developed package, but developing one. So, it may occur any errors at this time. To download this package run the below terminal command

conda install -c conda-forge sunpy

In this tutorial we skip this part, If you interested in this package and need to guide documentation, please check this SunPy website and its documentation