大榔頭的電腦隨筆: Python Tips - Lists and Collections

Anonymous variables

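The anonymous variable is written "_". By convention it receives values you do not intend to use, which spares you from inventing names for them. For example:

data = ['david', 18, '68kg', '180cm']
_, age, weight, _ = data
print(age)  # 18

We only want "18" and "68kg" from the data list, so the other two positions are assigned to the anonymous variable "_".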
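The same convention works when unpacking inside a loop. The sketch below uses a hypothetical list of records (not from the original example) and keeps only the age and weight of each one:

```python
# Hypothetical records in the form (name, age, weight, height).
records = [
    ('david', 18, '68kg', '180cm'),
    ('mary', 21, '52kg', '165cm'),
]

ages = []
weights = []
# "_" marks the name and height fields, which this loop ignores.
for _, age, weight, _ in records:
    ages.append(age)
    weights.append(weight)

print(ages)     # [18, 21]
print(weights)  # ['68kg', '52kg']
```

Note that "_" is still an ordinary variable name, so it simply holds whatever value was assigned to it last.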

"*name" captures any number of elements

For example:

data = ['david', 87, 91, 80, 54, 'B']
name, *score, grade = data
print(score)  # [87, 91, 80, 54]

The number of scores can vary, so "*score" collects however many there are into a list.
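The starred name does not have to sit in the middle. As a small sketch (the names first, rest, head, and last are chosen here only for illustration), it can also come first or last, and it receives an empty list when nothing is left over:

```python
data = ['david', 87, 91, 80, 54, 'B']

# Starred name at the end: everything after the first element.
first, *rest = data
print(first)  # david
print(rest)   # [87, 91, 80, 54, 'B']

# Starred name at the start: everything before the last element.
*head, last = data
print(head)   # ['david', 87, 91, 80, 54]
print(last)   # B

# With no elements left over, the starred name gets an empty list.
name, *scores, grade = ['amy', 'A']
print(scores)  # []
```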

Finding the N largest or smallest elements in a list

The nlargest() and nsmallest() functions in the heapq module return the N largest or smallest elements of a list. For example:

import heapq

data = [23, 87, 91, 80, 54, -4, 72, 19, 83]
print(heapq.nlargest(3, data))   # [91, 87, 83]
print(heapq.nsmallest(3, data))  # [-4, 19, 23]

Both functions accept a "key" parameter that selects which field the elements are compared by. For example:

import heapq

portfolio = [
    {'name': 'IBM', 'price': 91.1},
    {'name': 'AAPL', 'price': 543.22},
    {'name': 'FB', 'price': 21.09},
    {'name': 'HPQ', 'price': 31.75},
    {'name': 'YHOO', 'price': 16.35}
]
cheap = heapq.nsmallest(2, portfolio, key=lambda s: s['price'])
expensive = heapq.nlargest(2, portfolio, key=lambda s: s['price'])
print(cheap)      # [{'name': 'YHOO', 'price': 16.35}, {'name': 'FB', 'price': 21.09}]
print(expensive)  # [{'name': 'AAPL', 'price': 543.22}, {'name': 'IBM', 'price': 91.1}]
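It may help to see how these functions relate to the built-ins. The sketch below checks that nlargest() and nsmallest() return the same items as sorting and slicing. The heapq documentation suggests these functions for relatively small N, sorted() for larger N, and min() or max() when N is 1:

```python
import heapq

data = [23, 87, 91, 80, 54, -4, 72, 19, 83]

top3 = heapq.nlargest(3, data)
bottom3 = heapq.nsmallest(3, data)

# Same result as sorting the whole list and taking a slice.
print(top3 == sorted(data, reverse=True)[:3])  # True
print(bottom3 == sorted(data)[:3])             # True

# For a single element, min() and max() are the simpler choice.
print(max(data))  # 91
print(min(data))  # -4
```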

Keeping a dictionary in order

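Before Python 3.7, the language did not guarantee the order of a dictionary's elements, so it could differ from the order in which they were inserted. (Since Python 3.7, a regular dict preserves insertion order.) The OrderedDict class in the collections module keeps keys in insertion order on any version. For example:

from collections import OrderedDict

order = OrderedDict()
order['eng'] = 98
order['mat'] = 60
order['nat'] = 76
order['his'] = 80
for key in order:
    print(key, order[key])

""" Output:
eng 98
mat 60
nat 76
his 80
"""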

This order is also preserved when the dictionary is converted to JSON. For example, using the order dictionary above:

import json

print(json.dumps(order))  # {"eng": 98, "mat": 60, "nat": 76, "his": 80}
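Beyond remembering insertion order, OrderedDict also lets you rearrange existing keys. The following is a minimal sketch using its move_to_end() method. The plain dict at the end assumes Python 3.7 or later:

```python
from collections import OrderedDict

order = OrderedDict()
order['eng'] = 98
order['mat'] = 60
order['nat'] = 76

# Move an existing key to the end of the ordering.
order.move_to_end('eng')
print(list(order))  # ['mat', 'nat', 'eng']

# With last=False the key moves to the front instead.
order.move_to_end('nat', last=False)
print(list(order))  # ['nat', 'mat', 'eng']

# Assumption: Python 3.7+, where a plain dict keeps insertion order too.
plain = {'eng': 98, 'mat': 60, 'nat': 76}
print(list(plain))  # ['eng', 'mat', 'nat']
```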

Minimum, maximum, and sorting by dictionary values

When you take the minimum or maximum of a dictionary, or sort it, the operation is applied to its keys. For example:

score = {
    'eng': 98,
    'mat': 60,
    'nat': 76,
    'his': 80
}
print('minimum:', min(score))  # minimum: eng
print('maximum:', max(score))  # maximum: nat
print('sort:', sorted(score))  # sort: ['eng', 'his', 'mat', 'nat']

To operate on the values instead, use the values() method:

print('minimum:', min(score.values()))  # minimum: 60

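Usually, though, when we operate on the values we want both the key and the value back. To get them, first use zip() to pair each value with its key as (value, key), then apply the operation:

zipscore = zip(score.values(), score.keys())
print('minimum:', min(zipscore))  # minimum: (60, 'mat')
zipscore = zip(score.values(), score.keys())
print('maximum:', max(zipscore))  # maximum: (98, 'eng')
zipscore = zip(score.values(), score.keys())
print('sort:', sorted(zipscore))  # sort: [(60, 'mat'), (76, 'nat'), (80, 'his'), (98, 'eng')]

Note that the iterator created by zip() can be consumed only once, so it has to be recreated before each use.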
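The single-use behavior is easy to see directly. The sketch below also shows an alternative to the zip() approach, not used in the original: passing score.get as the key function, which returns the key whose value is smallest or largest without building pairs:

```python
score = {'eng': 98, 'mat': 60, 'nat': 76, 'his': 80}

# A zip iterator is exhausted after one full pass.
pairs = zip(score.values(), score.keys())
print(min(pairs))   # (60, 'mat')
leftover = list(pairs)
print(leftover)     # []

# Alternative: compare the keys by their values with key=score.get.
lowest = min(score, key=score.get)
highest = max(score, key=score.get)
ranked = sorted(score, key=score.get)
print(lowest, score[lowest])    # mat 60
print(highest, score[highest])  # eng 98
print(ranked)                   # ['mat', 'nat', 'his', 'eng']
```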

Counting element occurrences

The Counter class in the collections module counts how many times each element appears in a list, which is handy for finding the most frequent words. For example:

from collections import Counter

words = [
    'look', 'into', 'my', 'eyes', 'look', 'into', 'my', 'eyes',
    'the', 'eyes', 'the', 'eyes', 'the', 'eyes', 'not', 'around', 'the',
    'eyes', "don't", 'look', 'around', 'the', 'eyes', 'look', 'into',
    'my', 'eyes', "you're", 'under'
]
word_counts = Counter(words)
print(word_counts)  # Counter({'eyes': 8, 'the': 5, 'look': 4, 'into': 3, 'my': 3, 'around': 2, 'not': 1, "don't": 1, "you're": 1, 'under': 1})

The result is a Counter, which is a dict subclass, so you can look up the count of a given element. For example:

print(word_counts['into'])  # 3

The most_common(n) method returns the n most frequent items. For example:

print(word_counts.most_common(3))  # [('eyes', 8), ('the', 5), ('look', 4)]
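A Counter can also be updated with more data and combined with other counters. The sketch below uses short hypothetical word lists rather than the list above, so the totals are easy to check by hand:

```python
from collections import Counter

word_counts = Counter(['look', 'into', 'my', 'eyes', 'look'])
more_words = ['why', 'are', 'you', 'not', 'looking', 'in', 'my', 'eyes']

# update() adds the new occurrences to the existing counts.
word_counts.update(more_words)
print(word_counts['eyes'])     # 2
print(word_counts['look'])     # 2
# A missing element counts as 0 instead of raising KeyError.
print(word_counts['missing'])  # 0

# Counters support + and -; subtraction drops counts that fall to zero or below.
a = Counter(['look', 'eyes', 'eyes'])
b = Counter(['eyes', 'my'])
total = a + b
diff = a - b
print(total['eyes'])  # 3
print(diff['eyes'])   # 1
print('my' in diff)   # False
```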

Sorting a list of dictionaries

Data read from a database or fetched as JSON from the network usually arrives as a list of dictionaries. To sort it by a given key, use the itemgetter function from the operator module. For example:

from operator import itemgetter

rows = [
    {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003},
    {'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
    {'fname': 'John', 'lname': 'Cleese', 'uid': 1001},
    {'fname': 'Big', 'lname': 'Jones', 'uid': 1004}
]
print(sorted(rows, key=itemgetter('uid')))

""" Output:
[{'fname': 'John', 'lname': 'Cleese', 'uid': 1001},
 {'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
 {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003},
 {'fname': 'Big', 'lname': 'Jones', 'uid': 1004}]
"""

You can also pass several sort keys, for example sorting by lname first and then by fname where the lname is the same:

print(sorted(rows, key=itemgetter('lname', 'fname')))

A lambda expression achieves the same thing:

print(sorted(rows, key=lambda r: r['uid']))

However, itemgetter typically runs a little faster than the equivalent lambda.

The min() and max() functions also accept itemgetter or a lambda as the key, returning the item with the smallest or largest value for the chosen field.
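As a short sketch of both points, the code below reuses the rows list from this section. It picks the rows with the lowest and highest uid, and lists the first names in the order produced by the two-key sort:

```python
from operator import itemgetter

rows = [
    {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003},
    {'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
    {'fname': 'John', 'lname': 'Cleese', 'uid': 1001},
    {'fname': 'Big', 'lname': 'Jones', 'uid': 1004}
]

# min() and max() take the same kind of key function as sorted().
lowest = min(rows, key=itemgetter('uid'))
highest = max(rows, key=lambda r: r['uid'])
print(lowest)   # {'fname': 'John', 'lname': 'Cleese', 'uid': 1001}
print(highest)  # {'fname': 'Big', 'lname': 'Jones', 'uid': 1004}

# Sort by lname, then by fname for rows sharing the same lname.
by_name = sorted(rows, key=itemgetter('lname', 'fname'))
first_names = [r['fname'] for r in by_name]
print(first_names)  # ['David', 'John', 'Big', 'Brian']
```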


Monday, October 29, 2018
