Question:

If I have the following, how do I make pd.DataFrame() turn this array into a dataframe with two columns? What's the most efficient way? My current approach involves copying each element into a Series and making dataframes out of them.

From this:

([[u'294 (24%) L', u'294 (26%) R'],
  [u'981 (71%) L', u'981 (82%) R'],])

to

  x    y
294  294
981  981

rather than

x

[u'294 (24%) L', u'294 (26%) R']

My current approach is below. I'm looking for something more efficient:

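import pandas as pd

# numlist is the nested list shown above
numL = pd.Series(numlist).map(lambda x: x[0])  # first item of each pair
numR = pd.Series(numlist).map(lambda x: x[1])  # second item of each pair
nL = pd.DataFrame(numL, columns=['left_num'])
nR = pd.DataFrame(numR, columns=['right_num'])
nLR = nL.join(nR)
nLR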

UPDATE:

I noticed that my error simply comes down to whether you call pd.DataFrame() on a list or on a Series. When you create a dataframe out of a Series of lists, it keeps each inner list as a single item in one column. Not so with a nested list, where each inner item gets its own column. That solved my problem in the most efficient way.
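To make the difference concrete, here is a minimal sketch. It assumes numlist is the nested list from the question; the column name 'pairs' is made up for illustration.

```python
import pandas as pd

# Assumed to match the nested list shown in the question.
numlist = [[u'294 (24%) L', u'294 (26%) R'],
           [u'981 (71%) L', u'981 (82%) R']]

# Nested list passed directly: each inner item lands in its own column.
from_list = pd.DataFrame(numlist, columns=['left_num', 'right_num'])

# Series of lists: each inner list stays as one object in a single column.
# 'pairs' is a hypothetical column name.
from_series = pd.Series(numlist).to_frame('pairs')

print(from_list.shape)    # two columns
print(from_series.shape)  # one column
```

So passing the nested list straight to pd.DataFrame() avoids the per-column Series and the join entirely.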

Answer:
In [172]: data = [[u'294 (24%) L', u'294 (26%) R'],  [u'981 (71%) L', u'981 (82%) R'],]

In [173]: clean_data = [[int(item.split()[0]) for item in row] for row in data]

In [174]: clean_data
Out[174]: [[294, 294], [981, 981]]

In [175]: pd.DataFrame(clean_data, columns=list('xy'))
Out[175]: 
     x    y
0  294  294
1  981  981

[2 rows x 2 columns]
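As an alternative to the list comprehension above, the cleanup could also be done after building the dataframe, using pandas string methods. This is only a sketch, and it assumes every cell starts with a run of digits, as in the sample data; it is not necessarily faster for small inputs.

```python
import pandas as pd

data = [[u'294 (24%) L', u'294 (26%) R'],
        [u'981 (71%) L', u'981 (82%) R']]

# Build the two-column frame from the raw strings first.
raw = pd.DataFrame(data, columns=list('xy'))

# For each column, pull out the leading digits and convert them to integers.
# Assumes every cell begins with digits; a cell that does not would yield NaN
# and make the integer conversion fail.
clean = raw.apply(lambda col: col.str.extract(r'^(\d+)', expand=False).astype(int))

print(clean)
```

The list comprehension in the answer is probably the simpler choice when the data is already a plain Python list; the version here may be handier if the strings are already sitting in a dataframe.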