Strawberry Fields

I was porting a project from Graphene-Django to Strawberry when I suddenly hit a snag. Once I found out what had happened to the Strawberry Fields, I was just glad I could solve it at all.

The premise

The premise is that I worked on a project written with Graphene-Django, which is essentially the Python framework implementation behind a GraphQL schema for Django. You have a GraphQL endpoint where you send your queries and mutations, and it validates them against the schema and executes them. The framework has two problems: it is not fully async, and it leans heavily on metaclass programming, which means a lot of configuring but not a lot of control over the implementation details.

This meant that when we hit performance issues, we tried everything, from low-hanging fruit like Dataloaders to optimizing certain queries by bypassing the ORM (Graphene Mongo in this case) and using PyMongo directly.

Still, we got stuck again, and the fact that we relied heavily on the Python Promise implementation, built to match Promises/A+ from JavaScript land, did not help. There were also attempts to streamline the concurrency, but alas. So the move is to another framework, preferably in Python, to keep the current dev team.

Enter Strawberry

Strawberry is a nice framework that supports both sync and async. It involves a lot of configuring as well, but it also lets you control the implementation details, for instance through things called FieldExtensions. These can run either when the schema is first generated (through the apply method) or when the nodes are being resolved (through the sync resolve or the async resolve_async method). It is a wonderful, middleware-style way to tweak the implementation details.
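To show the shape of that idea without pulling in the framework, here is a toy model in plain Python. None of these classes are Strawberry's own: ToyField, ToyExtension, UpperCaseExtension and build_field are names I made up for illustration, and the real hooks take more parameters (such as info). It only mimics the two moments described above: apply runs once when the field is set up, resolve wraps every call to the resolver.

```python
from typing import Any, Callable, List


class ToyField:
    """Toy stand-in for a schema field (not Strawberry's StrawberryField)."""

    def __init__(self, name: str, resolver: Callable[..., Any]) -> None:
        self.name = name
        self.resolver = resolver
        self.arguments: List[str] = []


class ToyExtension:
    """Toy stand-in for a field extension with the two hooks."""

    def apply(self, field: ToyField) -> None:
        """Called once while the schema is built; may modify the field."""

    def resolve(self, next_: Callable[..., Any], source: Any, **kwargs: Any) -> Any:
        """Called on every resolution; wraps the next resolver in the chain."""
        return next_(source, **kwargs)


class UpperCaseExtension(ToyExtension):
    def apply(self, field: ToyField) -> None:
        # Schema-time tweak: register an extra argument on the field.
        field.arguments.append("shout")

    def resolve(self, next_: Callable[..., Any], source: Any, **kwargs: Any) -> Any:
        # Resolve-time tweak: consume our own argument, then post-process.
        shout = kwargs.pop("shout", False)
        result = next_(source, **kwargs)
        return result.upper() if shout else result


def wrap(ext: ToyExtension, next_: Callable[..., Any]) -> Callable[..., Any]:
    """Bind one extension around the next resolver in the chain."""
    def wrapped(source: Any, **kwargs: Any) -> Any:
        return ext.resolve(next_, source, **kwargs)
    return wrapped


def build_field(field: ToyField, extensions: List[ToyExtension]) -> Callable[..., Any]:
    """Run apply() once, then chain resolve() calls around the resolver."""
    for ext in extensions:
        ext.apply(field)
    resolver = field.resolver
    # Wrap from the inside out so the first extension ends up outermost.
    for ext in reversed(extensions):
        resolver = wrap(ext, resolver)
    return resolver


greeting = ToyField("greeting", lambda source, **kwargs: f"hello {source}")
resolve_greeting = build_field(greeting, [UpperCaseExtension()])
```

Calling resolve_greeting("world", shout=True) goes through the extension first and only then reaches the original resolver, which is the middleware feel I mean.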

One thing that was lacking in the Graphene-Django setup was automatic control over which fields were allowed as filters. Ideally these should always be just the fields exposed on the Node itself. That was not the case, however; you had to set it up manually. To make the new stack better and fix that particular nuisance, I wrote a simple BaseExtension class:

class BaseExtension(FieldExtension):

    def apply(self, field: StrawberryField) -> None:
        self.filter_fields = []
        resolved_type: Type[WithStrawberryObjectDefinition] = cast(
            Type[WithStrawberryObjectDefinition], field.resolve_type()
        )
        if resolved_type.__strawberry_definition__.specialized_type_var_map:
            node = cast(
                Type[WithStrawberryObjectDefinition],
                resolved_type.__strawberry_definition__.specialized_type_var_map[
                    "NodeType"
                ],
            )

        for f in node.__strawberry_definition__.fields:
            field.arguments.append(
                StrawberryArgument(
                    python_name=f.name,
                    graphql_name=f.name.replace("_", "")
                    if f.name.startswith("_")
                    else None,
                    type_annotation=StrawberryAnnotation(
                        Optional[f.type]
                        if not isinstance(f.type, StrawberryOptional)
                        else Optional[f.type.of_type]
                    ),
                    description="",
                    default=strawberry.UNSET,
                )
            )
            self.filter_fields.append(f.name)

All this really does is go over the fields defined on the node and add each one as an argument, so that you can filter on them and they get passed along as kwargs to the resolve functions.
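For a feel of what "passed along as kwargs" can look like on the receiving end, here is a rough sketch in plain Python. The apply_filters helper, the rows data and the local UNSET sentinel are all made up for this example (UNSET stands in for strawberry.UNSET); the real resolvers in the project query the database instead of filtering dicts.

```python
from typing import Any, Dict, Iterable, List

UNSET = object()  # local stand-in for strawberry.UNSET in this sketch


def apply_filters(
    rows: Iterable[Dict[str, Any]],
    filter_fields: List[str],
    **kwargs: Any,
) -> List[Dict[str, Any]]:
    """Keep the rows that match every filter argument the client actually sent."""
    # Arguments the client left out arrive as UNSET and must not filter anything.
    active = {
        name: value
        for name, value in kwargs.items()
        if name in filter_fields and value is not UNSET
    }
    return [
        row
        for row in rows
        if all(row.get(name) == value for name, value in active.items())
    ]


rows = [
    {"name": "kitchen", "project_id": 1},
    {"name": "garage", "project_id": 2},
]
filtered = apply_filters(rows, ["name", "project_id"], name=UNSET, project_id=2)
```

Defaulting every generated argument to UNSET is what makes this work: only the filters that were really supplied end up narrowing the result.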

Perfect.

Snag time

I was porting the NodeTypes, and this project also uses Relay, a set of conventions layered on top of GraphQL. That is not very important here, except to say that up to this point I had not made any connections yet, as in one Node -> another Node, which is quite common in GraphQL and in Relay.

When I made the first connection, the schema would not even generate. I was so frustrated, and nothing worked. I could go from resolver function -> relay.ListConnection[NodeType] but not from Node -> relay.ListConnection[NodeType]. It kept complaining that it was not a GraphQL input type. I did not want it as an input. I struggled and looked deep into the source code of everything, trying to hack it there. I was busy making it dynamically an input or an output depending on properties when I suddenly stopped. Since there was no mention of this anywhere online, it had to be a problem I had caused myself.

I went to bed, late. Woke up. Paced around a bit. In my head I thought, why is it automatically turning into an argu.....oh I am an idiot.

So I revisited the BaseExtension class that powered my dynamic argument adding. I tweaked it here and there, and the following is the fixed version:

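import inspect
from typing import Optional, Type, cast

import strawberry

# FieldExtension, StrawberryField, StrawberryArgument, StrawberryAnnotation,
# StrawberryOptional and WithStrawberryObjectDefinition also come from the
# strawberry package; their module paths differ between Strawberry versions.


class BaseExtension(FieldExtension):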
    filter_fields = ["project_id", "change_order_id"]

    def apply(self, field: StrawberryField) -> None:
        self.filter_fields = ["project_id", "change_order_id"]
        resolved_type: Type[WithStrawberryObjectDefinition] = cast(
            Type[WithStrawberryObjectDefinition], field.resolve_type()
        )
        if resolved_type.__strawberry_definition__.specialized_type_var_map:
            node = cast(
                Type[WithStrawberryObjectDefinition],
                resolved_type.__strawberry_definition__.specialized_type_var_map[
                    "NodeType"
                ],
            )
        else:
            node = resolved_type

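        for f in node.__strawberry_definition__.fields:
            # Connections are output types and can never be arguments, so skip
            # them. issubclass() needs a real class, hence inspect.isclass().
            if inspect.isclass(f.type) and issubclass(
                f.type, strawberry.relay.types.ListConnection
            ):
                continue
            # Same check for optional connections, where the class is wrapped.
            if isinstance(f.type, StrawberryOptional):
                if inspect.isclass(f.type.of_type) and issubclass(
                    f.type.of_type, strawberry.relay.types.ListConnection
                ):
                    continue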
            field.arguments.append(
                StrawberryArgument(
                    python_name=f.name,
                    graphql_name=f.name.replace("_", "")
                    if f.name.startswith("_")
                    else None,
                    type_annotation=StrawberryAnnotation(
                        Optional[f.type]
                        if not isinstance(f.type, StrawberryOptional)
                        else Optional[f.type.of_type]
                    ),
                    description="",
                    default=strawberry.UNSET,
                )
            )
            self.filter_fields.append(f.name)

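Essentially, what I needed to do was check whether the type (or its of_type, for optionals) is a class. If it is, check whether it is a relay.ListConnection subclass, and if so exclude it from the argument generation. A connection is an output type, so it can never be used as an argument, which is exactly what the schema was complaining about. Everything worked right after this.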
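The check itself can be tried out in isolation. Below is a small sketch with stand-in classes so it runs without Strawberry installed: ListConnection and OptionalOf are placeholders for strawberry.relay.ListConnection and StrawberryOptional, and ProjectConnection and is_connection are hypothetical names. It mainly shows why the inspect.isclass guard is there.

```python
import inspect
from typing import Any


class ListConnection:
    """Stand-in for strawberry.relay.ListConnection."""


class ProjectConnection(ListConnection):
    """Hypothetical connection type exposed on a node."""


class OptionalOf:
    """Stand-in for StrawberryOptional: an instance wrapping the real type."""

    def __init__(self, of_type: Any) -> None:
        self.of_type = of_type


def is_connection(field_type: Any) -> bool:
    """True if the type, or the type inside an optional wrapper, is a connection."""
    if isinstance(field_type, OptionalOf):
        field_type = field_type.of_type
    # issubclass() raises TypeError when handed something that is not a class,
    # so inspect.isclass() has to come first.
    return inspect.isclass(field_type) and issubclass(field_type, ListConnection)
```

Anything for which is_connection returns True gets skipped; everything else is still turned into a filter argument.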

Conclusion

I really like this framework. It gives me insight into how things operate and why a particular query is sometimes slow, and it gives you the space to fix it. For example, I already fixed things so that we can load all the necessary subparts of a node in one go with the Dataloaders. That was not possible before. However, it was still as slow as the old stack, because each node on its own tried to create a new relay.ListConnection, essentially for one Edge.

We already have all the instances needed to make the edges when running the Dataloader logic, so that is also the spot to create all the edges in one go. Then keep a simple mapping of node.id -> Edge and you are done. This sped things up by quite a significant margin.
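As a rough sketch of that idea, under a few assumptions of mine: Task, Edge, TASKS, fetch_tasks and load_task_edges are invented for the example, the cursor format is arbitrary, I read node.id as the id of the parent node, and a real Dataloader batch function would be async and hit the database. The point is just one fetch, one pass to build the edges, and results handed back in the same order as the keys.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Task:
    """Hypothetical child record that hangs off a parent node."""
    id: int
    project_id: int


@dataclass
class Edge:
    """Stand-in for a Relay edge."""
    node: Task
    cursor: str


# Stand-in for the database.
TASKS = [Task(1, 10), Task(2, 10), Task(3, 20)]


def fetch_tasks(project_ids: List[int]) -> List[Task]:
    """One query for all requested parents instead of one query per parent."""
    wanted = set(project_ids)
    return [task for task in TASKS if task.project_id in wanted]


def load_task_edges(project_ids: List[int]) -> List[List[Edge]]:
    """Batch function: build every edge once and group them by parent id."""
    edges_by_project: Dict[int, List[Edge]] = defaultdict(list)
    for task in fetch_tasks(project_ids):
        edges_by_project[task.project_id].append(
            Edge(node=task, cursor=str(task.id))
        )
    # A batch function must return one result per key, in key order.
    return [edges_by_project.get(pid, []) for pid in project_ids]


result = load_task_edges([10, 20, 30])
```

Each node then only has to look up its own list of ready-made edges instead of building a connection from scratch.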

That is something the old stack could not really do. It had no real way of giving you the tools to do the same thing.

#devlife #python #graphql