Question:

I'm not sure how to represent a certain data structure in Python. It consists of groups and users, where each user must be a member of exactly one group and the groups are in turn held in a container. Groups and users will only be used within this container. Furthermore, I need random access to groups and users. A JSON representation of example data would look like this:

{
    "groupa": {
        "name": "groupa",
        "description": "bla",
        "members": {
            "usera": {
                "name": "usera",
                "age": 38
            },
            "userb": {
                "name": "userb",
                "age": 20
            }
        }
    },
    "groupb": {
        "name": "groupb",
        "description": "bla bla",
        "members": {
            "userc": {
                "name": "userc",
                "age": 56
            }
        }
    }
}

Simply using nested dicts seems unsuitable because users and groups all have well-defined attributes. Because groups and users are only used within the container, I came up with nested classes:

class AccountContainer:

    class Group:
        def __init__(self, container, group):
            self.name = group
            self.members = {}
            self.container = container
            self.container.groups[self.name] = self  # add myself to container

    class User:
        def __init__(self, group, user, age=None):
            self.name = user
            self.age = age
            self.group = group
            self.group.members[self.name] = self  # add myself to group

    def __init__(self):
        self.groups = {}

    def add_user(self, group, username, age=None):
        # possibly check if group exists
        self.groups[group].members[username] = AccountContainer.User(self.groups[group], username, age=age)

    def add_group(self, group):
        self.groups[group] = AccountContainer.Group(self, group)

# creating
c = AccountContainer()
c.add_group("groupa")
c.add_user("groupa", "usera")

# access
c.groups["groupa"].members["usera"].age = 38

# deleting
del c.groups["groupa"].members["usera"]

  • How would you represent such a datastructure?
  • Is this a reasonable approach?

To me it seems a bit unnatural using a method to create a group or user while otherwise referring to dicts.

Answer:

I think an abundance of behavior-less classes, in a multi-paradigm language (one like C++ or Python, that while supporting classes doesn't constrain you to use them when simpler structures will do), is a "design smell" -- the design equivalent of a "code smell", albeit a mild one.

If I were doing a code review of this, I'd point that out, although it's nowhere near bad enough for me to insist on a refactoring. Nested classes (that have no specific code-behavioral reason to be nested) compound this: they offer no specific benefits and can, on the other hand, "get in the way", for example, in Python, by interfering with serialization (pickling).

In addition to good old dicts and full-fledged classes, Python 2.6 and later offer the handy alternative of namedtuples for "structs" with a predefined set of attributes; they seem particularly suitable to this use case.
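As a rough illustration of the namedtuple idea, here is a minimal sketch. The type names `User` and `Group` and their fields are assumptions taken from the JSON example in the question, not something this answer spells out. One thing to keep in mind is that namedtuples are immutable, so an update such as setting a user's age has to go through `_replace()`:

```python
from collections import namedtuple

# Hypothetical record types; field names follow the question's JSON example.
User = namedtuple("User", ["name", "age"])
Group = namedtuple("Group", ["name", "description", "members"])

usera = User(name="usera", age=None)
groupa = Group(name="groupa", description="bla", members={usera.name: usera})

# namedtuples are immutable, so "changing" a field means building a new
# record with _replace() and storing it back in the (mutable) members dict.
groupa.members["usera"] = groupa.members["usera"]._replace(age=38)

print(groupa.members["usera"].age)  # 38
```

The original `usera` record is left untouched; only the entry stored in the group's `members` dict points at the updated copy.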

The handy "add this group/user to that container/group" functionality that's combined in your add... and __init__ methods can be refactored into standalone functions (so can accessors, even though that's less of a problem -- hiding internal structure behind standalone accessors gets you closer to respecting the Law of Demeter).
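One way the standalone-function refactoring could look is sketched below, using plain dicts for storage. The function names `add_group`, `add_user`, and `get_user` are hypothetical, chosen only for this illustration:

```python
def add_group(groups, name, description=""):
    """Create an empty group and register it in the container dict."""
    if name in groups:
        raise ValueError("group already exists: %r" % name)
    groups[name] = {"name": name, "description": description, "members": {}}
    return groups[name]


def add_user(groups, group_name, user_name, age=None):
    """Create a user inside an existing group."""
    members = groups[group_name]["members"]  # raises KeyError if the group is missing
    members[user_name] = {"name": user_name, "age": age}
    return members[user_name]


def get_user(groups, group_name, user_name):
    """Accessor that hides the nesting from callers."""
    return groups[group_name]["members"][user_name]


groups = {}
add_group(groups, "groupa", "bla")
add_user(groups, "groupa", "usera")
get_user(groups, "groupa", "usera")["age"] = 38
```

Callers go through the functions rather than reaching into the nested dicts, so the internal layout can change later without touching call sites. This sketch does not enforce the "exactly one group per user" rule from the question.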

Answer:

It's generally good practice to not have objects know about what contains them. Pass the user in to the group, not the group into the user. Just because your current application only has "users" used once, one per group, one group per account, doesn't mean you should hardcode all your classes with that knowledge. What if you want to reuse the User class elsewhere? What if you later need to support multiple AccountContainers with users in common?

You may also get some mileage out of named tuples, especially for your users:

import collections

User = collections.namedtuple('User', ('name', 'age'))

class Group:
  def __init__(self, name, users=()):
    self.name = name
    # index members by user name for direct lookup
    self.members = dict((u.name, u) for u in users)

  def add(self, user):
    self.members[user.name] = user

et cetera

Answer:

I would feel comfortable using dicts, but I'd put the contents in lists: a list instead of a dict keeps the data cleaner and less redundant, since each name is stored only once rather than as both a key and a field:

[
  {
    "name": "groupa",
    "description": "bla",
    "members": [{"name": "usera", "age": 38},
                {"name": "userb", "age": 20}]
  },
  {
    "name": "groupb",
    "description": "bla bla",
    "members": [{"name": "userc", "age": 56}]
  }
]
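One caveat with a list layout like this is that access by name, which the question asks for, needs either a linear scan or a separately built index. A minimal sketch, assuming the data is bound to a variable named `groups_list`; the `user_index` helper is hypothetical and not part of this answer:

```python
groups_list = [
    {"name": "groupa", "description": "bla",
     "members": [{"name": "usera", "age": 38}, {"name": "userb", "age": 20}]},
    {"name": "groupb", "description": "bla bla",
     "members": [{"name": "userc", "age": 56}]},
]

# Linear scan: fine for small data, but O(n) per lookup.
groupb = next(g for g in groups_list if g["name"] == "groupb")

# Hypothetical index built once; maps each user name to (group, user).
user_index = {
    user["name"]: (group, user)
    for group in groups_list
    for user in group["members"]
}

group, user = user_index["userc"]
print(group["name"], user["age"])  # groupb 56
```

The index has to be rebuilt or updated whenever users are added, moved, or removed, which is the trade-off against keying the data by name in the first place.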

Update:

You can still pick a random element using the random module:

import random
groups_list[random.randrange(len(groups_list))]

Answer:

To echo Alex's answer: these nested classes reek of code smell to me.

Simpler maybe:

def Group(name=None, description=None, members=None):
    if name is None:
        name = "UNK!"  # some reasonable default
    if members is None:
        members = dict()  # fresh dict per call, avoids a shared mutable default
    return dict(name=name, description=description, members=members)

In your original proposal, your objects are just glorified dicts anyway, and the only reason to use objects (in this code) is to get a cleaner init to handle empty attributes. Making them into functions that return actual dicts is nearly as clean, and much easier. Named tuples seem like an even better solution though, as previously pointed out.

This (the nested dicts approach) has the benefit of being trivial to construct from, or dump to, JSON.
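To illustrate the JSON point, here is a minimal round-trip sketch using the standard-library `json` module. The variable names `accounts`, `text`, and `restored` are placeholders for this example:

```python
import json

accounts = {
    "groupa": {
        "name": "groupa",
        "description": "bla",
        "members": {"usera": {"name": "usera", "age": 38}},
    }
}

text = json.dumps(accounts, indent=2, sort_keys=True)  # nested dicts -> JSON string
restored = json.loads(text)                            # JSON string -> nested dicts

print(restored["groupa"]["members"]["usera"]["age"])  # 38
```

Class instances or namedtuples would need an extra conversion step in each direction, which is the convenience this answer is pointing at.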
