marsilea.upset.UpsetData

class marsilea.upset.UpsetData(data, sets_names=None, items=None, sets_attrs=None, items_attrs=None)

Bases: object

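Handle multiple sets and their intersections.

Normally, an UpsetData is created with one of the construction class methods (from_sets or from_memberships) rather than by calling the constructor directly.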

Terminology that you might not be familiar with:
  • set: a collection of unique items

  • subset: an intersection of sets

  • degree: the number of sets that intersect to form a subset

  • cardinality: the number of items in a subset
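
To make these terms concrete, here is a minimal pure-Python sketch. It does not use marsilea and is not the library's implementation. It assumes the usual UpSet convention that each item belongs to the subset identified by the exact combination of sets containing it. The helper name `describe_subsets` is hypothetical.

```python
from collections import defaultdict


def describe_subsets(named_sets):
    """Group items by the exact combination of sets that contain them.

    Returns a dict mapping a tuple of set names (the subset) to its
    degree and cardinality. Illustrative helper, not part of marsilea.
    """
    members = defaultdict(list)
    all_items = set().union(*named_sets.values())
    for item in sorted(all_items):
        # The combination of sets containing this item identifies its subset.
        combo = tuple(name for name, s in named_sets.items() if item in s)
        members[combo].append(item)
    return {
        combo: {"degree": len(combo), "cardinality": len(items)}
        for combo, items in members.items()
    }


sets = {"Set 1": {1, 2, 3, 4}, "Set 2": {3, 4, 5, 6}}
summary = describe_subsets(sets)
for combo, stats in summary.items():
    print(combo, stats)
```

Here the subset formed by both sets has degree 2 (two sets intersect) and cardinality 2 (items 3 and 4).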

Parameters:
data : bool matrix

A one-hot encoded matrix indicating whether an item is in a set. Columns are sets and rows are items

sets_names : array, optional

The names of the sets

items : array, optional

The names of the items

sets_attrs : pd.DataFrame, optional

The attributes of the sets; the index should be the same as sets_names

items_attrs : pd.DataFrame, optional

The attributes of the items; the index should be the same as items
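
The layout expected for `data` can be sketched in plain Python as follows. This only illustrates the rows-are-items, columns-are-sets convention described above. The helper `one_hot_matrix` is hypothetical, and which concrete container types the constructor accepts is not stated here.

```python
def one_hot_matrix(sets):
    """Build a boolean matrix with one row per item and one column per set.

    Illustrative helper, not part of marsilea.
    """
    items = sorted(set().union(*sets))
    # matrix[i][j] is True when items[i] is a member of sets[j].
    matrix = [[item in s for s in sets] for item in items]
    return items, matrix


sets = [{1, 2, 3, 4}, {3, 4, 5, 6}]
items, matrix = one_hot_matrix(sets)
for item, row in zip(items, matrix):
    print(item, row)
```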

Examples

from marsilea.upset import UpsetData
sets = [[1,2,3,4], [3,4,5,6]]
data = UpsetData.from_sets(sets)

binary_table()

cardinality()

The number of items in intersections

degree()

The number of sets that intersect to form each subset

filter(min_degree=None, max_degree=None, min_cardinality=None, max_cardinality=None)

Filter by degree or cardinality

Parameters:
min_degree : int

The minimum degree

max_degree : int

The maximum degree

min_cardinality : int

The minimum cardinality

max_cardinality : int

The maximum cardinality
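
As a rough illustration of what filtering on these bounds means, the sketch below applies them to a hand-written list of subsets. It is not marsilea's implementation. It assumes that bounds are inclusive and that `None` means "no bound"; the helper name `filter_subsets` and the dict layout are hypothetical.

```python
def filter_subsets(subsets, min_degree=None, max_degree=None,
                   min_cardinality=None, max_cardinality=None):
    """Keep subsets whose degree and cardinality fall within the bounds."""

    def within(value, low, high):
        # Assumption: bounds are inclusive, and None disables a bound.
        return (low is None or value >= low) and (high is None or value <= high)

    return [
        s for s in subsets
        if within(s["degree"], min_degree, max_degree)
        and within(s["cardinality"], min_cardinality, max_cardinality)
    ]


subsets = [
    {"sets": ("A",), "degree": 1, "cardinality": 5},
    {"sets": ("A", "B"), "degree": 2, "cardinality": 3},
    {"sets": ("A", "B", "C"), "degree": 3, "cardinality": 1},
]
kept = filter_subsets(subsets, min_degree=2, min_cardinality=2)
print([s["sets"] for s in kept])
```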

classmethod from_memberships(items, items_names=None, sets_attrs=None, items_attrs=None)

Create UpsetData by describing the sets each item is in

Parameters:
items : array of array of sets_names, dict

The memberships of the items: for each item, the names of the sets it belongs to

items_names : optional

The names of the items; if not provided, items are automatically named “Item 1, Item 2, …”

sets_attrs : pd.DataFrame, optional

The attributes of the sets; the index should be the same as sets_names

items_attrs : pd.DataFrame, optional

The attributes of the items; the index should be the same as items
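
A membership description is the item-centred view of the same information that a collection of sets holds. The sketch below converts one into the other in plain Python. It assumes the dict form maps each item name to the names of the sets it belongs to; the helper `memberships_to_sets` is hypothetical and not part of marsilea.

```python
def memberships_to_sets(memberships):
    """Turn {item: [set names]} into {set name: {items}}."""
    sets = {}
    for item, set_names in memberships.items():
        for name in set_names:
            # Create the set on first use, then record the item in it.
            sets.setdefault(name, set()).add(item)
    return sets


memberships = {
    "Item 1": ["A", "B"],
    "Item 2": ["B"],
    "Item 3": ["A", "B", "C"],
}
sets = memberships_to_sets(memberships)
for name, members in sets.items():
    print(name, sorted(members))
```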

classmethod from_sets(sets, sets_names=None, sets_attrs=None, items_attrs=None)

Create UpsetData from a series of sets

Parameters:
sets : array of sets, dict

The sets data

sets_names : optional

The names of the sets; if not provided, sets are automatically named “Set 1, Set 2, …”

sets_attrs : pd.DataFrame, optional

The attributes of the sets; the index should be the same as sets_names

items_attrs : pd.DataFrame, optional

The attributes of the items; the index should be the same as items

get_items_attr(attr)

Return the attribute of items in the order of plotting

Parameters:
attr : key in items_attrs

The attribute to retrieve

Returns:
array of attribute

has_item(item)

Return a list of the names of the sets that contain the item

intersection(sets_name)

Return the items that are shared by the given sets
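
The idea of "items shared by several sets" corresponds to an ordinary set intersection, sketched below with Python's built-in `set` type. This is an illustration only: the helper `shared_items` is hypothetical, and whether the library's method also excludes items found in other sets is not stated here.

```python
def shared_items(named_sets, names):
    """Return the items present in every set listed in `names`."""
    chosen = [named_sets[name] for name in names]
    if not chosen:
        # No sets requested: nothing can be shared.
        return set()
    return set.intersection(*chosen)


named_sets = {"A": {1, 2, 3, 4}, "B": {3, 4, 5, 6}, "C": {4, 6, 7}}
print(sorted(shared_items(named_sets, ["A", "B"])))
print(sorted(shared_items(named_sets, ["A", "B", "C"])))
```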

intersection_count()

The number of sets in which each item occurs

property items
property items_attrs
mark(present=None, absent=None, min_cardinality=None, max_cardinality=None, min_degree=None, max_degree=None)
reset()
property sets_attrs
property sets_names
sets_size()
sets_table()
sort_sets(ascending=False, order=None)

Control the order of sets

Parameters:
ascending : bool

Sort in ascending order if True

order : list

Explicitly specify the order of sets

sort_subsets(by='degree', ascending=False)

Sort the subsets by degree or cardinality

Parameters:
by : str

Sort by either degree or cardinality

ascending : bool

Sort in ascending order if True
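
The ordering these two parameters describe can be sketched in plain Python as below. This is an illustration of the concept, not the library's code. The helper `sorted_subsets` and the dict layout are hypothetical, and how ties are broken in marsilea is not covered here.

```python
def sorted_subsets(subsets, by="degree", ascending=False):
    """Order subsets by their degree or cardinality."""
    if by not in ("degree", "cardinality"):
        raise ValueError("by must be 'degree' or 'cardinality'")
    # sorted() is ascending by default, so reverse it for descending order.
    return sorted(subsets, key=lambda s: s[by], reverse=not ascending)


subsets = [
    {"sets": ("A", "B"), "degree": 2, "cardinality": 3},
    {"sets": ("A",), "degree": 1, "cardinality": 5},
    {"sets": ("A", "B", "C"), "degree": 3, "cardinality": 1},
]
by_degree = sorted_subsets(subsets)
by_size = sorted_subsets(subsets, by="cardinality", ascending=True)
print([s["sets"] for s in by_degree])
print([s["sets"] for s in by_size])
```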