How to Convert Paramiko Output to an Array

Paramiko’s exec_command returns stdin, stdout, and stderr as file-like objects. To process tabular command output, such as CSV or whitespace-delimited tables, read the output, split it into lines, and convert it to Python lists or NumPy arrays. This guide demonstrates four approaches: pure Python lists, the csv module, pandas, and NumPy.

1. Capture SSH Output with Paramiko

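import paramiko

ssh = paramiko.SSHClient()
# AutoAddPolicy trusts unknown host keys: convenient for testing, risky in production
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect('hostname', username='user', password='pass')

try:
    stdin, stdout, stderr = ssh.exec_command('cat /path/to/data.txt')
    # read() returns bytes, so decode before parsing
    output = stdout.read().decode('utf-8')
finally:
    ssh.close()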
  

Note: Always call read() on stdout, or iterate over it, before closing the connection.
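
The capture step can be wrapped in a small helper that also checks the exit status, so the output of a failed remote command is not parsed as if it were data. The sketch below is one possible shape, not a Paramiko API: run_remote is a hypothetical name, and it assumes client is an already-connected paramiko.SSHClient.

```python
def run_remote(client, command, encoding="utf-8"):
    """Run `command` on a connected SSH client and return its decoded stdout.

    `client` is assumed to behave like paramiko.SSHClient; `run_remote`
    itself is a hypothetical helper, not part of Paramiko.
    """
    _stdin, stdout, stderr = client.exec_command(command)
    # Drain both streams before asking for the exit status.
    out = stdout.read().decode(encoding)
    err = stderr.read().decode(encoding)
    status = stdout.channel.recv_exit_status()
    if status != 0:
        raise RuntimeError(f"{command!r} exited with {status}: {err.strip()}")
    return out
```

With the connection from the snippet above still open, a call such as output = run_remote(ssh, 'cat /path/to/data.txt') would replace the manual read and decode.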

2. Convert to Python List of Lists

Use str.splitlines() and str.split() for whitespace-delimited output.

# Example whitespace-delimited output:
# "1.0 2.0 3.0\n4.0 5.0 6.0\n"

lines = output.splitlines()
data = [list(map(float, line.split())) for line in lines if line.strip()]

print(data)
# [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
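
Real command output often includes a header line or stray text that float() rejects. As a minimal sketch, the hypothetical helper parse_numeric_rows below keeps only the lines where every field converts cleanly. The sample string is made up for illustration.

```python
def parse_numeric_rows(text):
    """Return a list of float rows, skipping blank and non-numeric lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # blank line
        try:
            rows.append([float(field) for field in fields])
        except ValueError:
            continue  # header or other non-numeric line
    return rows


sample = "x y z\n1.0 2.0 3.0\n\n4.0 5.0 6.0\n"
print(parse_numeric_rows(sample))
# [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```

Silently skipping bad lines is a design choice. If a malformed row should stop processing, re-raise the ValueError instead.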
    

3. Parse CSV Output with csv Module

If the remote command produces comma-separated values:

import io, csv

# Example CSV output: "1,2,3\n4,5,6\n"
reader = csv.reader(io.StringIO(output))
data_csv = [list(map(int, row)) for row in reader]
print(data_csv)
# [[1, 2, 3], [4, 5, 6]]
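
When the CSV output starts with a header row, csv.DictReader can key each value by column name. The sketch below assumes a made-up three-column layout (host, cpu, mem), so adjust the names and conversions to match the real command output.

```python
import csv
import io

# Hypothetical CSV output with a header row
sample = "host,cpu,mem\nweb1,12,2048\nweb2,7,1024\n"

reader = csv.DictReader(io.StringIO(sample))
records = [
    {"host": row["host"], "cpu": int(row["cpu"]), "mem": int(row["mem"])}
    for row in reader
]
print(records)
# [{'host': 'web1', 'cpu': 12, 'mem': 2048}, {'host': 'web2', 'cpu': 7, 'mem': 1024}]
```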
    

4. Load into pandas.DataFrame

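With sep=None and engine='python', pandas sniffs the delimiter for you. It does not detect headers: by default the first row becomes the column names, so pass header=None when the output has no header row.

import io
import pandas as pd

# Example CSV output with a header row: "A,B,C\n1,2,3\n4,5,6\n"
# sep=None infers the delimiter; the first row is used as the header
df = pd.read_csv(io.StringIO(output), sep=None, engine='python')
print(df)
#    A  B  C
# 0  1  2  3
# 1  4  5  6
data_list = df.values.tolist()
print(data_list)
# [[1, 2, 3], [4, 5, 6]]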
    

Tip: For space-delimited data, pass sep=r'\s+' to pd.read_csv. The older delim_whitespace=True option is deprecated as of pandas 2.2.
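
For headerless, space-delimited output, a regular-expression separator works with read_csv. The snippet below is a small sketch on a made-up sample string; header=None tells pandas that the first line is data.

```python
import io

import pandas as pd

sample = "1.0 2.0 3.0\n4.0 5.0 6.0\n"

# header=None: the first line is data, not column names.
df = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None)
data_list = df.to_numpy().tolist()
print(data_list)
# [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```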

5. Convert to numpy.ndarray

For numerical computations, convert strings directly to arrays with numpy.fromstring or genfromtxt:

import io
import numpy as np

# Using fromstring for whitespace-delimited output.
# It returns a flat 1-D array, so reshape with the known column count (3 here).
array_ws = np.fromstring(output, sep=' ').reshape(-1, 3)
print(array_ws)
# [[1. 2. 3.]
#  [4. 5. 6.]]

# Using genfromtxt when output is CSV instead
array_csv = np.genfromtxt(io.StringIO(output), delimiter=',')
print(array_csv)
    

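Note: fromstring is typically fast, but it returns a flat array and discards row boundaries, so every row must have the same number of columns for the reshape to be valid. genfromtxt preserves rows and fills missing fields with nan.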
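
When the column count is not known in advance, numpy.loadtxt is an alternative that infers the shape from the text. The sketch below uses two made-up sample strings to show loadtxt on whitespace-delimited data and genfromtxt on a CSV row with an empty field.

```python
import io

import numpy as np

ws_sample = "1.0 2.0 3.0\n4.0 5.0 6.0\n"
csv_sample = "1,2,3\n4,,6\n"  # second row has an empty field

# loadtxt splits on whitespace by default and keeps the row structure.
array_ws = np.loadtxt(io.StringIO(ws_sample))
print(array_ws.shape)
# (2, 3)

# genfromtxt turns the empty field into nan instead of raising.
array_csv = np.genfromtxt(io.StringIO(csv_sample), delimiter=",")
print(np.isnan(array_csv[1, 1]))
# True
```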

6. Handling Large Outputs Efficiently

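  • Stream line by line, before closing the connection, to avoid holding one large string in memory:
    data = []
    for line in stdout:
        if not line.strip():
            continue  # skip blank lines
        data.append(list(map(float, line.split())))

  • Use pandas.read_csv with chunksize to parse in batches. Here output is already fully in memory, so chunking limits the size of each DataFrame rather than the size of the raw text. process is a placeholder for your own function:
    chunks = pd.read_csv(io.StringIO(output), chunksize=1000, sep=None, engine='python')
    for chunk in chunks:
        process(chunk.values)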
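
The two ideas above can be combined by streaming rows and grouping them into fixed-size batches without ever building the full string. This is a minimal sketch: iter_rows and iter_batches are hypothetical helper names, io.StringIO stands in for Paramiko's stdout, and it assumes that iterating the real stdout yields decoded text lines.

```python
import io
from itertools import islice


def iter_rows(lines):
    """Yield one list of floats per non-blank line of a line iterator."""
    for line in lines:
        fields = line.split()
        if fields:
            yield [float(field) for field in fields]


def iter_batches(rows, size):
    """Group an iterator of rows into lists of at most `size` rows."""
    rows = iter(rows)
    while True:
        batch = list(islice(rows, size))
        if not batch:
            return
        yield batch


# io.StringIO stands in for Paramiko's stdout here.
fake_stdout = io.StringIO("1 2 3\n4 5 6\n\n7 8 9\n")
batches = list(iter_batches(iter_rows(fake_stdout), size=2))
print(batches)
# [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], [[7.0, 8.0, 9.0]]]
```

In real use, each batch would be processed inside the loop rather than collected into a list, so that only one batch is held in memory at a time.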
          

7. Summary Checklist

  1. Use stdout.read().decode() or iterate stdout to capture remote output.
  2. For simple whitespace data, split lines and map to floats.
  3. Use csv.reader for CSV parsing.
  4. Load into pandas.DataFrame for advanced features.
  5. Convert directly to numpy.ndarray with fromstring or genfromtxt.
  6. Stream large outputs line-by-line or in chunks to conserve memory.