Move a column by name to the front of the table in pandas

Question 1

Here is my df:

                             Net   Upper   Lower  Mid  Zsore
Answer option                                                
More than once a day          0%   0.22%  -0.12%   2    65 
Once a day                    0%   0.32%  -0.19%   3    45
Several times a week          2%   2.45%   1.10%   4    78
Once a week                   1%   1.63%  -0.40%   6    65

How can I move a column by name ("Mid") to the front of the table (index 0)? The result should look like this:

                             Mid   Upper   Lower  Net  Zsore
Answer option                                                
More than once a day          2   0.22%  -0.12%   0%    65 
Once a day                    3   0.32%  -0.19%   0%    45
Several times a week          4   2.45%   1.10%   2%    78
Once a week                   6   1.63%  -0.40%   1%    65

My current code moves the column by index using df.columns.tolist(), but I'd like to move it by name.

Question 2

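We can use ix to reorder by passing a list (ix was deprecated in pandas 0.20.0 and removed in 1.0; on current versions use the loc form given at the end of this answer):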

In [27]:
# get a list of columns
cols = list(df)
# move the column to head of list using index, pop and insert
cols.insert(0, cols.pop(cols.index('Mid')))
cols
Out[27]:
['Mid', 'Net', 'Upper', 'Lower', 'Zsore']
In [28]:
# use ix to reorder
df = df.ix[:, cols]
df
Out[28]:
                      Mid Net  Upper   Lower  Zsore
Answer_option                                      
More_than_once_a_day    2  0%  0.22%  -0.12%     65
Once_a_day              3  0%  0.32%  -0.19%     45
Several_times_a_week    4  2%  2.45%   1.10%     78
Once_a_week             6  1%  1.63%  -0.40%     65

Another method is to take a reference to the column and reinsert it at the front:

In [39]:
mid = df['Mid']
df.drop(labels=['Mid'], axis=1,inplace = True)
df.insert(0, 'Mid', mid)
df
Out[39]:
                      Mid Net  Upper   Lower  Zsore
Answer_option                                      
More_than_once_a_day    2  0%  0.22%  -0.12%     65
Once_a_day              3  0%  0.32%  -0.19%     45
Several_times_a_week    4  2%  2.45%   1.10%     78
Once_a_week             6  1%  1.63%  -0.40%     65

You can also use loc to achieve the same result, since ix is deprecated from pandas 0.20.0 onwards:

df = df.loc[:, cols]
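
For anyone who wants to run the loc variant end to end, here is a minimal sketch. The sample frame is reconstructed from the table in the question; storing the percentages as strings is an assumption made only for illustration.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed dtypes: strings and ints).
df = pd.DataFrame(
    {
        "Net": ["0%", "0%", "2%", "1%"],
        "Upper": ["0.22%", "0.32%", "2.45%", "1.63%"],
        "Lower": ["-0.12%", "-0.19%", "1.10%", "-0.40%"],
        "Mid": [2, 3, 4, 6],
        "Zsore": [65, 45, 78, 65],
    },
    index=pd.Index(
        ["More than once a day", "Once a day", "Several times a week", "Once a week"],
        name="Answer option",
    ),
)

# Build the new column order: pop 'Mid' by name and put it at position 0.
cols = list(df.columns)
cols.insert(0, cols.pop(cols.index("Mid")))

# Label-based selection: all rows, columns in the new order.
df = df.loc[:, cols]
print(list(df.columns))
```

The row index ("Answer option") is untouched by this; only the column order changes.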

Question 3

Maybe I'm missing something, but a lot of these answers seem overly complicated. You should be able to set the columns within a single list:

Column to the front:

df = df[ ['Mid'] + [ col for col in df.columns if col != 'Mid' ] ]

Or, if you want to move it to the back instead:

df = df[ [ col for col in df.columns if col != 'Mid' ] + ['Mid'] ]

Or, if you want to move more than one column:

cols_to_move = ['Mid', 'Zsore']
df           = df[ cols_to_move + [ col for col in df.columns if col not in cols_to_move ] ]
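
If you do this often, the list comprehension can be wrapped in a small helper. This is only a sketch: the name move_columns_to_front and the up-front check for missing names are my own additions, not part of pandas.

```python
import pandas as pd

def move_columns_to_front(df, cols_to_move):
    """Hypothetical helper: return a new DataFrame with cols_to_move first.

    The remaining columns keep their original relative order.
    """
    missing = [c for c in cols_to_move if c not in df.columns]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    rest = [c for c in df.columns if c not in cols_to_move]
    return df[list(cols_to_move) + rest]

# Small illustrative frame (made-up values).
example = pd.DataFrame({"Net": [0, 1], "Upper": [2, 3], "Mid": [4, 5], "Zsore": [6, 7]})
reordered = move_columns_to_front(example, ["Mid", "Zsore"])
print(list(reordered.columns))
```

The original frame is left as it was, because df[...] with a list returns a new DataFrame.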

Question 4

You can use the df.reindex() function in pandas. Given df:

                      Net  Upper   Lower  Mid  Zsore
Answer option                                      
More than once a day  0%  0.22%  -0.12%    2     65
Once a day            0%  0.32%  -0.19%    3     45
Several times a week  2%  2.45%   1.10%    4     78
Once a week           1%  1.63%  -0.40%    6     65

define a list of column names

cols = df.columns.tolist()
cols
Out[13]: ['Net', 'Upper', 'Lower', 'Mid', 'Zsore']

move the column name to wherever you want

cols.insert(0, cols.pop(cols.index('Mid')))
cols
Out[16]: ['Mid', 'Net', 'Upper', 'Lower', 'Zsore']

then use the df.reindex() function to reorder

df = df.reindex(columns= cols)

Output: df

                      Mid Net  Upper   Lower  Zsore
Answer option                                      
More than once a day    2  0%  0.22%  -0.12%     65
Once a day              3  0%  0.32%  -0.19%     45
Several times a week    4  2%  2.45%   1.10%     78
Once a week             6  1%  1.63%  -0.40%     65
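
One thing to be aware of with reindex: a column name that does not exist is not an error. The short sketch below (with a made-up frame and a deliberately misspelled name) contrasts this with plain list selection.

```python
import pandas as pd

# Made-up frame for illustration.
example = pd.DataFrame({"Net": [0, 1], "Mid": [2, 3]})

# 'Midd' is a deliberate typo: reindex silently creates an all-NaN column for it.
reindexed = example.reindex(columns=["Midd", "Net"])
print(reindexed["Midd"].isna().all())

# Selecting with a list raises KeyError for the same typo.
try:
    example[["Midd", "Net"]]
    raised = False
except KeyError:
    raised = True
print(raised)
```

So if the column list is typed by hand, df[cols] fails loudly on a typo, while reindex silently drops the real column from the result.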

Question 5

I prefer this solution:

col = df.pop("Mid")
df.insert(0, col.name, col)

It's simpler to read and faster than the other suggested answers.

def move_column_inplace(df, col, pos):
    col = df.pop(col)
    df.insert(pos, col.name, col)
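
A quick usage sketch for a position other than 0. The frame below is made up for illustration, and the function is repeated so the snippet runs on its own.

```python
import pandas as pd

def move_column_inplace(df, col, pos):
    col = df.pop(col)              # removes the column and returns it as a Series
    df.insert(pos, col.name, col)  # re-inserts it at the requested position

# Made-up frame for illustration.
example = pd.DataFrame({"Net": [0, 1], "Upper": [2, 3], "Mid": [4, 5], "Zsore": [6, 7]})

# Move 'Mid' to the second position; the frame is modified in place.
move_column_inplace(example, "Mid", 1)
print(list(example.columns))
```

Note that pos is counted after the column has been popped, so moving a column to the very end means passing len(df.columns) - 1 as computed before the call.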

Performance assessment:

For this test, a randomly chosen column is moved to the front in each repetition. In-place methods generally perform better. While citynorman's solution can be made in-place, Ed Chum's method based on .loc and sachinmm's method based on reindex cannot.

While the other methods are generic, citynorman's solution is limited to pos=0. I didn't observe any performance difference between df.loc[:, cols] and df[cols], which is why I didn't include some of the other suggestions.

I tested with Python 3.6.8 and pandas 0.24.2 on a MacBook Pro (Mid 2015).

import numpy as np
import pandas as pd

n_cols = 11
df = pd.DataFrame(np.random.randn(200000, n_cols),
                  columns=range(n_cols))

def move_column_inplace(df, col, pos):
    col = df.pop(col)
    df.insert(pos, col.name, col)

def move_to_front_normanius_inplace(df, col):
    move_column_inplace(df, col, 0)
    return df

def move_to_front_chum(df, col):
    cols = list(df)
    cols.insert(0, cols.pop(cols.index(col)))
    return df.loc[:, cols]

def move_to_front_chum_inplace(df, col):
    col = df[col]
    df.drop(col.name, axis=1, inplace=True)
    df.insert(0, col.name, col)
    return df

def move_to_front_elpastor(df, col):
    cols = [col] + [ c for c in df.columns if c!=col ]
    return df[cols]  # or df.loc[:, cols]

def move_to_front_sachinmm(df, col):
    cols = df.columns.tolist()
    cols.insert(0, cols.pop(cols.index(col)))
    df = df.reindex(columns=cols, copy=False)
    return df

def move_to_front_citynorman_inplace(df, col):
    # This approach exploits that reset_index() moves the index
    # at the first position of the data frame.
    df.set_index(col, inplace=True)
    df.reset_index(inplace=True)
    return df

def test(method, df):
    col = np.random.randint(0, n_cols)
    method(df, col)

col = np.random.randint(0, n_cols)
ret_mine = move_to_front_normanius_inplace(df.copy(), col)
ret_chum1 = move_to_front_chum(df.copy(), col)
ret_chum2 = move_to_front_chum_inplace(df.copy(), col)
ret_elpas = move_to_front_elpastor(df.copy(), col)
ret_sach = move_to_front_sachinmm(df.copy(), col)
ret_city = move_to_front_citynorman_inplace(df.copy(), col)

# Assert equivalence of solutions.
assert(ret_mine.equals(ret_chum1))
assert(ret_mine.equals(ret_chum2))
assert(ret_mine.equals(ret_elpas))
assert(ret_mine.equals(ret_sach))
assert(ret_mine.equals(ret_city))

Results:

# For n_cols = 11:
%timeit test(move_to_front_normanius_inplace, df)
# 1.05 ms ± 42.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit test(move_to_front_citynorman_inplace, df)
# 1.68 ms ± 46.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit test(move_to_front_sachinmm, df)
# 3.24 ms ± 96.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum, df)
# 3.84 ms ± 114 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_elpastor, df)
# 3.85 ms ± 58.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum_inplace, df)
# 9.67 ms ± 101 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


# For n_cols = 31:
%timeit test(move_to_front_normanius_inplace, df)
# 1.26 ms ± 31.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_citynorman_inplace, df)
# 1.95 ms ± 260 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_sachinmm, df)
# 10.7 ms ± 348 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum, df)
# 11.5 ms ± 869 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_elpastor, df)
# 11.4 ms ± 598 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum_inplace, df)
# 31.4 ms ± 1.89 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Question 6

I didn't like having to explicitly specify all the other columns in the other solutions, so this worked best for me. It might be slow for large dataframes, though...?

df = df.set_index('Mid').reset_index()
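
A caveat worth checking before using this on the frame from the question: set_index replaces the existing index, so a meaningful index such as "Answer option" is discarded. The sketch below uses a made-up two-row frame to show the effect and one possible workaround.

```python
import pandas as pd

# Made-up frame with a named, meaningful index.
example = pd.DataFrame(
    {"Net": ["0%", "2%"], "Mid": [2, 4]},
    index=pd.Index(["Once a day", "Several times a week"], name="Answer option"),
)

# set_index replaces the current index, so the 'Answer option' labels are lost.
lossy = example.set_index("Mid").reset_index()
print(list(lossy.columns), lossy.index.name)

# One way to keep them: turn the index into a column first, then restore it.
kept = (
    example.reset_index()
    .set_index("Mid")
    .reset_index()
    .set_index("Answer option")
)
print(list(kept.columns), kept.index.name)
```

With a default RangeIndex there is nothing to lose, and the one-liner works as is.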

Question 7

Here is a generic piece of code that I frequently use to rearrange the position of columns. You may find it useful.

cols = df.columns.tolist()
n = int(cols.index('Mid'))
cols = [cols[n]] + cols[:n] + cols[n+1:]
df = df[cols]
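
The same list manipulation can be generalized to an arbitrary target position. This is a sketch only; reorder_column is a hypothetical name, and it returns a new frame rather than modifying the original.

```python
import pandas as pd

def reorder_column(df, name, position=0):
    """Hypothetical helper: return a new DataFrame with `name` at `position`."""
    cols = df.columns.tolist()
    cols.remove(name)            # raises ValueError if the name is absent
    cols.insert(position, name)  # position is counted after the removal
    return df[cols]

# Made-up frame for illustration.
example = pd.DataFrame({"Net": [0, 1], "Upper": [2, 3], "Mid": [4, 5], "Zsore": [6, 7]})
front = reorder_column(example, "Mid")
third = reorder_column(example, "Net", position=2)
print(list(front.columns), list(third.columns))
```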

Question 8

To reorder the columns of a DataFrame, just use a list as follows.

df = df[['Mid', 'Net', 'Upper', 'Lower', 'Zsore']]

This makes it very obvious what was done when reading the code later. You can also use:

df.columns
Out[1]: Index(['Net', 'Upper', 'Lower', 'Mid', 'Zsore'], dtype='object')

Then cut and paste to reorder.

For a DataFrame with many columns, store the list of columns in a variable and pop the desired column to the front of the list. Here is an example:

import numpy as np
import pandas as pd

cols = [str(col_name) for col_name in range(1001)]
data = np.random.rand(10,1001)
df = pd.DataFrame(data=data, columns=cols)

mv_col = cols.pop(cols.index('77'))
df = df[[mv_col] + cols]

Now df.columns is:

Index(['77', '0', '1', '2', '3', '4', '5', '6', '7', '8',
       ...
       '991', '992', '993', '994', '995', '996', '997', '998', '999', '1000'],
      dtype='object', length=1001)

Question 9

Here is a very simple answer to this.

Don't forget the two sets of parentheses, (( )), around the column names; otherwise it will raise an error.


# here you can add below line and it should work 
df = df[list(('Mid','Upper', 'Lower', 'Net','Zsore'))]
df

                             Mid   Upper   Lower  Net  Zsore
Answer option                                                
More than once a day          2   0.22%  -0.12%   0%    65 
Once a day                    3   0.32%  -0.19%   0%    45
Several times a week          4   2.45%   1.10%   2%    78
Once a week                   6   1.63%  -0.40%   1%    65

Question 10

The simplest thing you can try is:

df=df[[ 'Mid',   'Upper',   'Lower', 'Net'  , 'Zsore']]