python - Extract subarrays of numpy array whose values are above a threshold


I have a sound signal, imported as a numpy array, and I want to cut it into chunks of numpy arrays. However, I want the chunks to contain only elements above a threshold. For example:

    threshold = 3
    signal = [1,2,6,7,8,1,1,2,5,6,7]

This should output two arrays:

    vec1 = [6,7,8]
    vec2 = [5,6,7]

OK, the above are lists, but you get the point.

Here is what I have tried so far, but it just kills my RAM:

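    def slice_raw_audio(audio_signal, threshold=5000):
        signal_slice, chunks = [], []

        for idx in range(0, audio_signal.shape[0], 1000):
            # idx never changes inside this loop, so once the condition is
            # true it appends forever and memory use keeps growing
            while audio_signal[idx] > threshold:
                signal_slice.append(audio_signal[idx])

            chunks.append(signal_slice)
        return chunks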

Here's one approach -

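    import numpy as np

    def split_above_threshold(signal, threshold):
        # Pad with False so runs touching either end still get both edges
        mask = np.concatenate(([False], signal > threshold, [False]))
        # Indices where the mask flips: start, stop, start, stop, ...
        idx = np.flatnonzero(mask[1:] != mask[:-1])
        return [signal[idx[i]:idx[i+1]] for i in range(0, len(idx), 2)]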

Sample run -

    In [48]: threshold = 3
        ...: signal = np.array([1,1,7,1,2,6,7,8,1,1,2,5,6,7,2,8,7,2])
        ...:

    In [49]: split_above_threshold(signal, threshold)
    Out[49]: [array([7]), array([6, 7, 8]), array([5, 6, 7]), array([8, 7])]
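To see why this works, it can help to print the intermediate values. The following is a minimal sketch that reuses the example signal from the question; the variable names simply mirror the function body above and nothing here is new API.

```python
import numpy as np

threshold = 3
signal = np.array([1, 2, 6, 7, 8, 1, 1, 2, 5, 6, 7])

# Pad the boolean mask with False on both ends so that every run of
# above-threshold values has both a start edge and a stop edge.
mask = np.concatenate(([False], signal > threshold, [False]))

# Positions where the mask flips; they alternate start, stop, start, stop.
idx = np.flatnonzero(mask[1:] != mask[:-1])
print(idx.tolist())  # [2, 5, 8, 11]

# Pair up (start, stop) and slice the original signal.
chunks = [signal[idx[i]:idx[i + 1]] for i in range(0, len(idx), 2)]
print([c.tolist() for c in chunks])  # [[6, 7, 8], [5, 6, 7]]
```

Because of the padding, `idx` always has an even length, so the pairing in the list comprehension never runs off the end.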

Runtime test

Other approaches -

    # @psidom's soln
    def arange_diff(signal, threshold):
        above_th = signal > threshold
        index, values = np.arange(signal.size)[above_th], signal[above_th]
        return np.split(values, np.where(np.diff(index) > 1)[0]+1)

    # @kasramvd's soln
    # Note: [1::2] assumes the signal starts at or below the threshold
    def split_diff_step(signal, threshold):
        return np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)[1::2]
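Before comparing speed, it may be worth checking that the three functions return the same chunks. The sketch below does that on the sample signal, and also tries an input that starts above the threshold. `as_lists` and the `leading` array are hypothetical names introduced only for this check.

```python
import numpy as np

def split_above_threshold(signal, threshold):
    mask = np.concatenate(([False], signal > threshold, [False]))
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    return [signal[idx[i]:idx[i + 1]] for i in range(0, len(idx), 2)]

def arange_diff(signal, threshold):
    above_th = signal > threshold
    index, values = np.arange(signal.size)[above_th], signal[above_th]
    return np.split(values, np.where(np.diff(index) > 1)[0] + 1)

def split_diff_step(signal, threshold):
    return np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)[1::2]

def as_lists(chunks):
    # Hypothetical helper: plain lists are easier to compare than arrays.
    return [c.tolist() for c in chunks]

threshold = 3
signal = np.array([1, 1, 7, 1, 2, 6, 7, 8, 1, 1, 2, 5, 6, 7, 2, 8, 7, 2])
expected = [[7], [6, 7, 8], [5, 6, 7], [8, 7]]

for func in (split_above_threshold, arange_diff, split_diff_step):
    assert as_lists(func(signal, threshold)) == expected

# Edge case: the signal starts above the threshold.
leading = np.array([5, 6, 1, 7])
print(as_lists(split_above_threshold(leading, threshold)))  # [[5, 6], [7]]
print(as_lists(arange_diff(leading, threshold)))            # [[5, 6], [7]]
print(as_lists(split_diff_step(leading, threshold)))        # [[1]]
```

On the second input `split_diff_step` returns the below-threshold chunk, because its `[1::2]` slice assumes the first piece is at or below the threshold. For the random test signal used in the timings this only matters if the first sample happens to exceed the threshold.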

Timings -

    In [67]: signal = np.random.randint(0,9,(100000))

    In [68]: threshold = 3

    # @kasramvd's soln
    In [69]: %timeit split_diff_step(signal, threshold)
    10 loops, best of 3: 39.8 ms per loop

    # @psidom's soln
    In [70]: %timeit arange_diff(signal, threshold)
    10 loops, best of 3: 20.5 ms per loop

    In [71]: %timeit split_above_threshold(signal, threshold)
    100 loops, best of 3: 8.22 ms per loop
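The timings above come from IPython's `%timeit`. Outside IPython, a rough equivalent can be sketched with the standard `timeit` module. The seed and the repeat counts below are arbitrary assumptions, and the numbers will vary with machine and NumPy version.

```python
import timeit
import numpy as np

def split_above_threshold(signal, threshold):
    mask = np.concatenate(([False], signal > threshold, [False]))
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    return [signal[idx[i]:idx[i + 1]] for i in range(0, len(idx), 2)]

# Same shape and value range as the test signal above; the seed is arbitrary.
rng = np.random.default_rng(0)
signal = rng.integers(0, 9, size=100_000)
threshold = 3

# Best of 3 repeats of 10 calls each, reported per call.
runs = timeit.repeat(lambda: split_above_threshold(signal, threshold),
                     number=10, repeat=3)
best = min(runs) / 10
print(f"{best * 1e3:.2f} ms per loop")
```

The other two functions can be timed the same way by swapping the function called inside the lambda.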
