marsilea.upset.UpsetData
- class marsilea.upset.UpsetData(data, sets_names=None, items=None, sets_attrs=None, items_attrs=None)
Bases: object
Handle multiple sets
Normally, the construction methods (from_sets, from_memberships) are used to create an UpsetData.
- Terminology that you might not be familiar with:
set: a collection of unique items
subset: intersection of sets
degree: the number of sets that intersect with each other
cardinality: the number of items in the subset
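The terms above can be illustrated with a small, library-independent sketch. It assumes, as is usual for UpSet plots, that each item is counted only in the exact combination of sets it belongs to; the variable names are illustrative and this is not marsilea's implementation.

```python
from collections import defaultdict

# Two named sets used purely as example data.
sets = {"Set 1": {1, 2, 3, 4}, "Set 2": {3, 4, 5, 6}}

# Group items by the exact combination of sets they belong to.
subsets = defaultdict(set)
for item in set().union(*sets.values()):
    membership = frozenset(
        name for name, members in sets.items() if item in members
    )
    subsets[membership].add(item)

# degree = number of sets in the combination,
# cardinality = number of items found in that combination.
summary = {
    tuple(sorted(combo)): {"degree": len(combo), "cardinality": len(items)}
    for combo, items in subsets.items()
}

for combo, stats in sorted(summary.items()):
    print(combo, stats)
```

Here items 3 and 4 form the only subset of degree 2, while the two degree-1 subsets hold the items unique to each set.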
- Parameters:
- data : bool matrix
A one-hot encoded matrix indicating whether an item is in a set. Columns are sets and rows are items.
- sets_names : optional, array
The names of the sets
- items : optional, array
The names of the items
- sets_attrs : optional, pd.DataFrame
The attributes of sets; the input index should be the same as sets_names
- items_attrs : optional, pd.DataFrame
The attributes of items; the input index should be the same as items
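To make the layout of the data parameter concrete, here is a minimal sketch that builds such a matrix in plain Python, with rows as items and columns as sets. Whether the constructor accepts nested lists directly or expects a NumPy array or DataFrame is not stated here, so the sketch stops at building the matrix.

```python
sets_names = ["Set 1", "Set 2"]
sets = [{1, 2, 3, 4}, {3, 4, 5, 6}]

# Rows are items, sorted so that the layout is reproducible.
items = sorted(set().union(*sets))

# One row per item, one column per set; True means the item is in that set.
data = [[item in members for members in sets] for item in items]

for item, row in zip(items, data):
    print(item, row)
```

The row for item 3 is `[True, True]` because it belongs to both sets, while the row for item 1 is `[True, False]`.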
Examples
from marsilea.upset import UpsetData
sets = [[1, 2, 3, 4], [3, 4, 5, 6]]
data = UpsetData.from_sets(sets)
- binary_table()
- cardinality()
The number of items in each intersection
- degree()
The number of sets involved in each intersection
- filter(min_degree=None, max_degree=None, min_cardinality=None, max_cardinality=None)
Filter by degree or cardinality
- Parameters:
- min_degree : int
The minimum degree
- max_degree : int
The maximum degree
- min_cardinality : int
The minimum cardinality
- max_cardinality : int
The maximum cardinality
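The effect of these four bounds can be sketched without the library. The helper below is hypothetical and treats every bound as inclusive, which is an assumption of this sketch rather than documented behavior.

```python
# Hypothetical summary of subsets: combination of sets -> number of items.
cardinality = {
    ("A",): 5,
    ("B",): 1,
    ("A", "B"): 3,
    ("A", "B", "C"): 2,
}


def filter_subsets(cardinality, min_degree=None, max_degree=None,
                   min_cardinality=None, max_cardinality=None):
    """Keep subsets whose degree and cardinality fall within the bounds.

    Bounds are inclusive here; that is an assumption of this sketch.
    """
    kept = {}
    for combo, size in cardinality.items():
        degree = len(combo)  # degree = number of sets in the combination
        if min_degree is not None and degree < min_degree:
            continue
        if max_degree is not None and degree > max_degree:
            continue
        if min_cardinality is not None and size < min_cardinality:
            continue
        if max_cardinality is not None and size > max_cardinality:
            continue
        kept[combo] = size
    return kept


result = filter_subsets(cardinality, min_degree=2, min_cardinality=3)
print(result)
```

Only `("A", "B")` survives: the single-set subsets fail the degree bound and `("A", "B", "C")` fails the cardinality bound.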
- classmethod from_memberships(items, items_names=None, sets_attrs=None, items_attrs=None)
Create UpsetData by describing the sets each item is in
- Parameters:
- items : array of array of sets_names, dict
The data of items: for each item, the names of the sets it belongs to
- items_names : optional
The names of the items; if not provided, items are automatically named “Item 1, Item 2, …”
- sets_attrs : optional, pd.DataFrame
The attributes of sets; the input index should be the same as sets_names
- items_attrs : optional, pd.DataFrame
The attributes of items; the input index should be the same as items
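A membership description is the inverse view of a collection of sets. The sketch below shows the array-of-arrays input shape and how it maps back to one collection of items per set; it is plain Python for illustration, not the classmethod's actual implementation.

```python
# Each entry lists the names of the sets that one item belongs to.
memberships = [["A", "B"], ["A"], ["B", "C"], ["A", "B"]]

# Default names following the "Item 1, Item 2, ..." pattern.
items_names = [f"Item {i}" for i in range(1, len(memberships) + 1)]

# Invert the memberships into one list of items per set.
sets = {}
for name, member_of in zip(items_names, memberships):
    for set_name in member_of:
        sets.setdefault(set_name, []).append(name)

for set_name, members in sets.items():
    print(set_name, members)
```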
- classmethod from_sets(sets, sets_names=None, sets_attrs=None, items_attrs=None)
Create UpsetData from a series of sets
- Parameters:
- sets : array of sets, dict
The sets data
- sets_names : optional
The names of the sets; if not provided, sets are automatically named “Set 1, Set 2, …”
- sets_attrs : optional, pd.DataFrame
The attributes of sets; the input index should be the same as sets_names
- items_attrs : optional, pd.DataFrame
The attributes of items; the input index should be the same as items
- get_items_attr(attr)
Return the attribute of items in the order of plotting
- Parameters:
- attr : key in items_attrs
The attribute of items to retrieve
- Returns:
- array of attribute values
- has_item(item)
Return a list of the names of the sets that contain the item
- intersection(sets_name)
Return the items shared by the given sets
- intersection_count()
The number of sets in which each item occurs
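The three queries above can be sketched on plain Python sets. The functions below are stand-ins that borrow the method names for readability; they assume an ordinary set intersection and are not the library's implementation.

```python
# Example data: set name -> members.
sets = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {3, 5}}


def has_item(item):
    """Names of the sets that contain the item."""
    return [name for name, members in sets.items() if item in members]


def intersection(sets_name):
    """Items shared by all of the named sets."""
    return set.intersection(*(sets[name] for name in sets_name))


def intersection_count():
    """For each item, the number of sets it appears in."""
    counts = {}
    for members in sets.values():
        for item in members:
            counts[item] = counts.get(item, 0) + 1
    return counts


print(has_item(3))
print(intersection(["A", "B"]))
print(intersection_count())
```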
- property items
- property items_attrs
- mark(present=None, absent=None, min_cardinality=None, max_cardinality=None, min_degree=None, max_degree=None)
- reset()
- property sets_attrs
- property sets_names
- sets_size()
- sets_table()
- sort_sets(ascending=False, order=None)
Control the order of sets
- Parameters:
- ascending : bool
Sort in ascending order if True
- order : list
Explicitly specify the order of sets
- sort_subsets(by='degree', ascending=False)
Sort the subsets by degree or cardinality
- Parameters:
- by : str
Sort by either degree or cardinality
- ascending : bool
Sort in ascending order if True
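As a rough illustration of the two sort keys, the sketch below orders a hypothetical list of subsets by degree or by cardinality. The data layout and the handling of ties (original order is kept) are assumptions of this sketch.

```python
# Hypothetical subsets: the sets they combine and how many items they hold.
subsets = [
    {"sets": ("A",), "cardinality": 5},
    {"sets": ("A", "B"), "cardinality": 3},
    {"sets": ("A", "B", "C"), "cardinality": 2},
    {"sets": ("B",), "cardinality": 1},
]


def sort_subsets(subsets, by="degree", ascending=False):
    """Return a new list of subsets sorted by degree or cardinality."""
    if by == "degree":
        key = lambda s: len(s["sets"])  # degree = number of sets combined
    elif by == "cardinality":
        key = lambda s: s["cardinality"]
    else:
        raise ValueError("by must be 'degree' or 'cardinality'")
    # sorted() is stable, so ties keep their original relative order.
    return sorted(subsets, key=key, reverse=not ascending)


by_degree = sort_subsets(subsets)
by_cardinality = sort_subsets(subsets, by="cardinality", ascending=True)
print([s["sets"] for s in by_degree])
print([s["sets"] for s in by_cardinality])
```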