# Indexing

To better understand how the indexing with a python list works, I tried this.
First, `a = [0, 1, 2, 3, 4, 5]`.
1. `a[4:6]` gives `[4, 5]`, as I learned before.
2. `a[-1:2]` gives `[]`.
3. `a[-2:]` gives `[4, 5]`.
4. `a[-5:2]` gives `[1]`.

Here is my understanding. Zero and positive indices count elements from the left, and negative indices count from the right. When we write `[ind1:ind2]`, the element indicated by `ind1` should be to the left of the element indicated by `ind2`; otherwise the result is an empty list.
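The four cases above can be checked with a small sketch. It uses the built-in `slice.indices` method to show which non-negative start and stop positions each slice resolves to; the names `cases` and `results` are only for illustration.

```python
a = [0, 1, 2, 3, 4, 5]

# Each slice written as a slice object, so we can ask how it is resolved
cases = {
    "a[4:6]": slice(4, 6),
    "a[-1:2]": slice(-1, 2),
    "a[-2:]": slice(-2, None),
    "a[-5:2]": slice(-5, 2),
}

results = {}
for label, s in cases.items():
    # indices() converts negative or missing bounds into plain positions
    start, stop, step = s.indices(len(a))
    results[label] = (start, stop, a[s])
    print(label, "-> start", start, "stop", stop, "result", a[s])
```

In the `a[-1:2]` case the resolved start (5) lies to the right of the stop (2), which is why the result is empty.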


# Generate batches for RNN

This week, I encountered implementations of a function generating batches for a recurrent neural network. The first one was:

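```python
import numpy as np

def get_batches(arr, batch_size, n_steps):
    # Number of characters consumed by one batch
    chars_per_batch = batch_size * n_steps
    n_batches = len(arr) // chars_per_batch

    # Keep only enough characters to make full batches
    arr = arr[:n_batches * chars_per_batch]
    arr = arr.reshape((batch_size, -1))

    for n in range(0, arr.shape[1], n_steps):
        x = arr[:, n:n+n_steps]
        # Targets: x shifted left by one, with the first column of x wrapped to the end
        y = np.zeros_like(x)
        y[:, :-1], y[:, -1] = x[:, 1:], x[:, 0]

        yield x, y
```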

Let me use an example. Suppose that we make batches from the sentence "The quick brown fox jumps over the lazy dog". Let’s say the variable `n_steps` is 5 (and, for simplicity, `batch_size` is 1). Then, the above function generates batches as follows:
In the first batch, `x` and `y` become `['T', 'h', 'e', ' ', 'q']` and `['h', 'e', ' ', 'q', 'T']`, respectively. And the second batch will be `x = ['u', 'i', 'c', 'k', ' ']` and `y = ['i', 'c', 'k', ' ', 'u']`.
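To reproduce this example, here is a minimal sketch. It assumes the text is stored as a NumPy array of single characters and that `batch_size` is 1; the function name `get_batches_wraparound` and the variable names are mine, and the body is the first implementation with the column count taken from `arr.shape[1]`.

```python
import numpy as np

def get_batches_wraparound(arr, batch_size, n_steps):
    chars_per_batch = batch_size * n_steps
    n_batches = len(arr) // chars_per_batch

    # Drop the trailing characters that do not fill a whole batch
    arr = arr[:n_batches * chars_per_batch]
    arr = arr.reshape((batch_size, -1))

    for n in range(0, arr.shape[1], n_steps):
        x = arr[:, n:n+n_steps]
        y = np.zeros_like(x)
        # Shift x by one and wrap its first column around to the end
        y[:, :-1], y[:, -1] = x[:, 1:], x[:, 0]
        yield x, y

# Assumption: one array element per character
text = np.array(list("The quick brown fox jumps over the lazy dog"))

batches = [(x[0].tolist(), y[0].tolist())
           for x, y in get_batches_wraparound(text, 1, 5)]

for x, y in batches[:2]:
    print("x =", x, " y =", y)
```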

So, the above function shifts the elements of `x` by one position to fill `y` and puts the first element of `x` into the last element of `y`. However, one may want to feed the real next character to `y` at the end of a batch. For the first batch, `u` instead of `T`, and for the second batch, `b` instead of `u`. This seems to happen in the following implementation:

```python
import numpy as np

def get_batches(arr, batch_size, n_steps):
    chars_per_batch = batch_size * n_steps
    n_batches = len(arr) // chars_per_batch

    arr = arr[:n_batches * chars_per_batch]
    arr = arr.reshape((batch_size, -1))

    for n in range(0, arr.shape[1], n_steps):
        x = arr[:, n:n+n_steps]

        # Targets are the inputs shifted by one column
        y_temp = arr[:, n+1:n+n_steps+1]

        # Columns of y not covered by y_temp stay zero
        y = np.zeros(x.shape, dtype=x.dtype)
        y[:, :y_temp.shape[1]] = y_temp

        yield x, y
```

This looks okay, except for the very last element of `arr`. After reshaping, the number of columns of `arr`, that is `arr.shape[1]`, should be `n_batches * n_steps`. At the last iteration of the `for` loop, `n` is `arr.shape[1] - n_steps`, so `x` can be filled with the last batch. Then, for `y_temp`, the slice `n+1:n+n_steps+1` reaches column index `arr.shape[1]`, which does not exist. Interestingly, I don’t get any errors (see footnote 1 below).

I was surprised that Python doesn’t raise any errors with the second implementation, considering that most of the errors I get are related to indexing and slicing arrays. Finally, I found a better implementation on someone’s GitHub.

```python
import numpy as np

def get_batches(arr, batch_size, n_steps):
    chars_per_batch = batch_size * n_steps
    n_batches = len(arr) // chars_per_batch

    arr = arr[:n_batches * chars_per_batch]
    arr = arr.reshape((batch_size, -1))

    for n in range(0, arr.shape[1], n_steps):
        x = arr[:, n:n+n_steps]

        y = np.zeros_like(x)

        try:
            # Last target is the real next character after this batch
            y[:, :-1], y[:, -1] = x[:, 1:], arr[:, n+n_steps]
        except IndexError:
            # Last batch: no next column exists, so wrap around to the first one
            y[:, :-1], y[:, -1] = x[:, 1:], arr[:, 0]

        yield x, y
```

This implementation takes care of a potential error at the last batch. I also noticed that it feeds the first column of `arr` (the first element of each row) into the last position of the last batch. In the second implementation, that position is left as zero.

1. I dug a bit and learned that `n+1:n+n_steps+1` is turned into a slice object. Unlike indexing with a single integer, slicing clips out-of-range bounds to the size of the array, so it returns a shorter (possibly empty) result instead of raising an `IndexError`. That is why the second implementation runs without errors.
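The difference between slicing and integer indexing can be seen in a short sketch. The array here is a made-up 2 x 5 example, not the data from the batching functions.

```python
import numpy as np

arr = np.arange(10).reshape(2, 5)   # 2 rows, 5 columns

# A slice that runs past the end is clipped: no error, just fewer columns
clipped = arr[:, 3:8]
print(clipped.shape)

# A slice that starts past the end is simply empty
empty = arr[:, 7:9]
print(empty.shape)

# slice.indices shows how the bounds are clipped for a length of 5
resolved = slice(3, 8).indices(5)
print(resolved)

# A single integer index that is out of range does raise IndexError
try:
    arr[:, 5]
    raised = False
except IndexError:
    raised = True
print(raised)
```

This also explains why the `try`/`except IndexError` in the third implementation works: `arr[:, n+n_steps]` uses an integer column index, not a slice.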

# x[0][2] vs. x[0,2]

From SciPy.org,

"So note that x[0,2] = x[0][2] though the second case is more inefficient as a new temporary array is created after the first index that is subsequently indexed by 2."

I guess I will always use `x[0,2]` instead of `x[0][2]`.
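A small sketch of the two forms, using a made-up 3 x 4 array. Both return the same element; the second goes through the intermediate row `x[0]` first.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)

a = x[0, 2]    # one indexing operation on the 2-D array
b = x[0][2]    # x[0] builds an intermediate 1-D array, then [2] indexes it

print(a, b, a == b)

# The intermediate object that x[0][2] creates along the way
row = x[0]
print(type(row).__name__, row.shape)
```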

# Set types of properties in MATLAB classes

https://undocumentedmatlab.com/blog/setting-class-property-types

I got an error when I tried to initialize a property of a class to an instance of another MATLAB class. The error read:

```
Conversion to double from (Class Name) is not possible.
```

So I looked for a way to specify the type of a property and found the link above. In short, `@Type` needs to follow the name of a property, like this:

```
classdef (Class Name)
    properties
        property@Type
    end
end
```

Alternatively, `property Type` is also possible since MATLAB R2016a.

# Jupyter Notebook in a VirtualEnv

With more than one virtual environment, some additional setup is needed to use Jupyter Notebook. To check whether Jupyter Notebook is using the same Python executable, run the following in a notebook cell:

```python
import sys
sys.executable
```

and

```
!which python3
```

I explicitly write `python3`, but if you are using Python 2, `python` may be enough. If the two outputs differ, then there is a problem.
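The two checks can also be combined in one cell. This is only a rough sketch: the function name `same_environment` is made up, and comparing the folders of the two executables is my own heuristic, not something from the Jupyter documentation. It deliberately does not resolve symlinks, because the `python3` inside a virtual environment is often a link to the system interpreter.

```python
import os
import shutil
import sys

def same_environment(command="python3"):
    """Compare the kernel's interpreter with the one found on PATH."""
    kernel_python = sys.executable
    path_python = shutil.which(command)   # None if the command is not on PATH
    if not kernel_python or path_python is None:
        return kernel_python, path_python, False
    # Heuristic: both executables should live in the same bin folder
    same = os.path.dirname(kernel_python) == os.path.dirname(path_python)
    return kernel_python, path_python, same

kernel_python, path_python, same = same_environment()
print("kernel :", kernel_python)
print("on PATH:", path_python)
print("same folder:", same)
```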

From here, I followed "Kernels for different environments" in the IPython documentation. First, I activated a virtual environment and installed ipykernel:

```
source (path to a folder)/bin/activate (name)
python3 -m ipykernel install --user --name name --display-name "display name"
```

It seems that Jupyter uses `name` internally. I have not yet tried activating a virtual environment without `name`. The `display name` is what we see when we select `New` in Jupyter Notebook. I have display names “TensorFlow” and “DataScienceBowl” in addition to “Python 3”.

Once a new notebook is created, run the checks shown at the top to be sure that the correct Python executable is running. We can switch the Python executable through the `Kernel` menu.

It seems that it may be possible to run a Python executable from a different environment within a notebook. I will check this later.

# Formatting a hard drive

This is one of those things that are simple if one knows what to do. If not, it takes some time to find the solution.

One needs to select the actual device instead of a volume in order to erase a hard drive. Thanks to https://discussions.apple.com/thread/8144371