235: pytest-django - Adam Johnson
Brian: All right.
Adam: Hello.
Brian: Welcome to Test and Code. Adam Johnson is here today. Welcome, Adam.
Adam: Thank you for having me back.
Brian: I've been kind of looking at a bunch of pytest plugins lately, and everybody asks me about pytest-django. I am learning Django, but I'm really not a Django expert yet. So let's jump into pytest-django.
Adam: All right, cool.
Brian: A lot of people were using unittest with Django first, and it's bundled with unittest, right?
Adam: Yeah, exactly. Django has its own testing framework that takes unittest and extends it in a few different ways, like hooking up Django's databases to manage them and adding a few niceties that unittest is missing. But pytest gives you things like --pdb to open a Python debugger when a test fails.
Brian: For the pytest-django plugin, I was looking at the top pytest plugins list, and currently pytest-django is number 12, so it's very widely used, I think. Do you recommend it to everybody that's using Django, or what do you think about that?
Adam: Nearly every project I've worked on in the last 5-10 years has used pytest-django. There have been a few that started with Django's testing framework, because the tutorial in Django tells you how to do that and it's an easy enough thing, and they've not managed to migrate to pytest. If you start doing some weird stuff within the test framework, then it's hard to migrate to running things under pytest. So there are occasional projects that do still use just Django's built-in test framework, but it's probably 80, 90% plus that are using pytest.
Adam: Cool.
Brian: What do we get with the pytest-django plugin?
Adam: Most of it is trying to imitate or replicate what Django's test framework does. So setting up databases is a key one. There's also some interaction there with pytest-xdist for parallelization.
Brian: You can do that with the Django thing?
Adam: Yeah, so Django's test framework, the one built into Django, adds parallelization on top of unittest.
Brian: Okay.
Adam: pytest-django integrates with pytest-xdist, so it knows when you're running tests in parallel and sets up databases with different names for each parallel worker.
Brian: Okay, cool.
Adam: And then it also tries to re-expose some of those features from Django's test framework. Like, there are some extra assertion functions in Django's test case classes. So pytest-django has, like, a non-camel-case version of these. I mean, unittest has a camel-case version, and pytest-django tries to rewrite them as snake-case versions.
Brian: Okay, well, that's nice. But I was just curious: is that something that drives you nuts about unittest? For me personally, yeah, I like snake case, but that's not really one of the reasons why I don't use unittest. I know the history behind it, because it came from Java land before we really settled on snake case everywhere. But that doesn't really bug me that much. I mean, it's slightly annoying. I guess I'm leading the answer, but is that part that much of a big deal to you?
Adam: It is not for me. In fact, what I tend to do is still use the test case classes out of Django, the unittest test case classes, but not all of the assertion functions. So I'm not writing assertEqual; I'm happy to use the pytest one. But for the extra ones that Django provides, like assertInHTML, for example, which lets you check for an HTML fragment in a larger HTML document, I'm happy using the camel-case ones.
Brian: So you can use that with pytest-django? You can mix and match a little bit, I guess?
Adam: Yeah, that function is available from within pytest-django, too, like a wrapper.
Brian: Oh, okay, cool. Neat. Is it difficult to switch to pytest-django?
Adam: In the typical case, it's not that hard. You're going to install pytest and pytest-django. There's one configuration value you'll need to set, which points at the Django settings module that you're using in tests. That's something that's already set up when you run Django's test framework, because you would just run Django's test command and it has the settings. But when we run pytest, that's outside of Django's flow, so you need to explicitly tell it where to load the settings from. And in a fairly reasonable project, like one without any testing extras, that's going to work for you.
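For readers following along, the one configuration value Adam mentions can be sketched as a pytest.ini fragment like the one below. DJANGO_SETTINGS_MODULE is the option pytest-django reads; the module path myproject.settings is only a placeholder for whatever settings module your tests should use.

```ini
# pytest.ini at the project root
# "myproject.settings" is a placeholder, not a real module path.
[pytest]
DJANGO_SETTINGS_MODULE = myproject.settings
```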
Brian: Okay, and if I look at pytest-django's documentation, it's right there. There's a getting started page and it tells you that right off the bat.
Brian: That's cool. Nice. Now, I can't remember. Do you help out with the pytest-django maintenance?
Adam: I am on there as a maintainer. I've made a handful of pull requests in the last few years. I'm kind of happy there as backup, reviewing the occasional thing. In case the project goes dormant, I can probably step up a bit more. I know some of the people involved through conferences, so I've met them in real life, and others just from their activities.
Brian: I know that it's used by lots of people, like you said, but that you're using it every day makes me feel better. Actually, something that's really important is the database. So having the database set up, what does that look like? Is it per function? Can you set the scoping of when the database is set up, or how do you deal with that?
Adam: Right, so there are kind of two parts there: there's getting the schema in shape, and then there's loading data. In terms of the schema, Django's behavior, which is what pytest-django copies, is that it creates a new database when you run tests. It basically takes the name of the database that you've configured in your settings and adds underscore test. Or, in the case of parallelization, it adds an extra little bit about the worker number that we're on. And then it needs to run Django's migration process, which runs through a series of files written in Python that tell Django how to set up your database, like create a table, add these columns. This is Django's migrations framework. Once that's done on all of your test databases for all your parallel processes, or your single one if you're in single mode, then it's ready to be used in each test. And then within tests, there are a couple of layers of transactions that allow you to write test data and use it within tests, and then the transactions get rolled back afterwards so the database is left clean.
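As a small illustration of the naming scheme, here is a sketch that assumes Django's documented default of putting a `test_` prefix in front of the configured name, and pytest-xdist worker ids of the form gw0, gw1. The function name and the "shop" database are made up for the example; this is not code from Django or pytest-django.

```python
def derive_test_db_name(configured_name, xdist_worker_id=None):
    """Sketch of how a test database name could be derived (illustrative only)."""
    # Assumption: Django's default test database prefix is "test_".
    name = f"test_{configured_name}"
    # Assumption: under pytest-xdist, the worker id (e.g. "gw0") is appended
    # so that each parallel worker gets its own database.
    if xdist_worker_id is not None:
        name = f"{name}_{xdist_worker_id}"
    return name


print(derive_test_db_name("shop"))         # test_shop
print(derive_test_db_name("shop", "gw0"))  # test_shop_gw0
```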
Brian: Oh, okay. So it doesn't completely tear down and recreate the database for every test?
Adam: No.
Brian: Oh, that's cool. So that's probably pretty fast.
Adam: Yes, it can be pretty fast even though you're using a real SQL database, thanks to this. If you have tests that need a transaction that actually commits, you can use a class in Django called TransactionTestCase, or the transactional_db fixture in pytest-django. Then, instead of rolling back the transaction, it goes and flushes out all the tables. That is a lot slower, and it gets slower the more tables you have. So a key thing in keeping your tests fast is avoiding that path whenever possible and making sure that things work with rollbacks.
Brian: Okay. Now I'm just thinking outside the box. Let's say most of your tests are rolling stuff back, so there's not much in there. If you run into a test that does need to test transactions, wouldn't there just not be that much there to do?
Adam: The problem is that after you've committed a transaction, you can't go back and ask the database, hey, which rows did I write to which tables? Or indeed update? That data is lost. So Django goes and flushes every table. I did come up with a patch, which was rejected because it didn't work on every database backend, that would do a bit more of a general query to find out which tables have rows. That would have sped that process up a bit, but it can't be sped up massively.
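The difference between the two cleanup strategies can be sketched with Python's built-in sqlite3 module. This is only an illustration of the idea, not how Django or pytest-django implement it; the table names and helper functions are invented for the example.

```python
import sqlite3


def make_db():
    # isolation_level=None lets us issue BEGIN / COMMIT / ROLLBACK ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.execute("CREATE TABLE orders (item TEXT)")
    return conn


def table_names(conn):
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return [row[0] for row in rows]


def row_counts(conn):
    return {
        table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        for table in table_names(conn)
    }


def run_with_rollback(conn, test_body):
    # Fast path: wrap the test in a transaction and throw it away afterwards.
    conn.execute("BEGIN")
    try:
        test_body(conn)
    finally:
        conn.execute("ROLLBACK")


def run_with_flush(conn, test_body):
    # Slow path: the test may commit, so we cannot know which tables were
    # touched and have to empty every table. Cost grows with the table count.
    test_body(conn)
    for table in table_names(conn):
        conn.execute(f"DELETE FROM {table}")


def writes_one_user(conn):
    conn.execute("INSERT INTO users VALUES ('ada')")


def commits_one_user(conn):
    conn.execute("BEGIN")
    conn.execute("INSERT INTO users VALUES ('ada')")
    conn.execute("COMMIT")


conn = make_db()
run_with_rollback(conn, writes_one_user)
print(row_counts(conn))  # every table is empty again
run_with_flush(conn, commits_one_user)
print(row_counts(conn))  # empty again, but only after visiting every table
```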
Brian: Okay, so let's say I've got a bunch of users and all sorts of stuff in there. Not transactions, bad term, but a lot of data that my database fixture has to set up. Most of my tests are just going to roll back. But I was thinking, if I really need to test the transaction part, I would probably have a separate chunk that used a small database without a bunch of data in it. That might be easier, because you're going to throw everything away and recreate it. If there's not much there, maybe that would... Anyway, I'll leave that to the experts.
Adam: That does seem worth thinking about.
Brian: I don't know if we covered this or not. That's one of the things: there are two meanings of the name "fixtures". Does Django still talk about the startup data as fixtures, or is there a different name?
Adam: There isn't a different name. It's still there, and it's generally not used by many people, I think. So to explain to listeners, the fixture system there is so you can have some data written out in files that are written in JSON or YAML or XML. It has several formats supported out of the box. The idea here was that you'd have Django build your database, and then you could load some sample data straight into the database that's maintained in some files that describe: this field says this, this field says this, this foreign key points to that row. I think that's generally fallen out of favor, as we've realized it's quite painful to manually update these fixture files when you make schema changes and keep that all up to date. And there's also the power of writing these factories, which you might have discussed on the podcast, with Factory Boy or other libraries that can populate your fields with random data. So you do not need to maintain exactly what goes in which field to still have a bunch of data.
Brian: Okay, so people generally don't use the fixture functionality of Django then, or it isn't recommended?
Adam: I don't think it's used that much. I don't think it goes as far as the docs saying "don't use this", and I'm sure there are a lot of use cases out there still, like sharing data between multiple production systems or something. So it's probably never going to go away, but for tests, I don't think many people use it now.
Brian: Okay, so what are people using for tests? Like Factory Boy or something?
Adam: Yeah, I think that's the main one that I see used.
Brian: And let's say, and I don't even know if this is a good idea, I'm just picking it at random: if I wanted to take a chunk of my live data, like a snapshot of my live data, and run with that, is that possible with all of this? Or is it just a bad idea?
Adam: It's a bad idea, I think, especially with privacy laws and whatnot.
Brian: Okay.
Adam: And writing tests against data that, what, is going to change the next time you dump your database out? Those are very flimsy tests then. I think it would be possible, like taking a SQL dump and then writing your migration process to load that after all the tables are set up. But yeah, I don't think it's a great idea.
Brian: Okay, that's fair. Well, I'm convinced. I mean, I guess I didn't need too much convincing, but my Django project will definitely use pytest-django. Anything about this plugin that we should cover that we haven't touched on yet?
Adam: Yeah, so I talked about the migration process. That can get very slow in large projects. As you make more and more changes to your database, this history of migration operations just grows. Django has some functions for cutting this back, called squashing, and there are other approaches, but it can still take a while and even be a significant part of your test run, with more than half the time spent just building the database. So to alleviate this, both Django's test framework and pytest-django provide ways to preserve the database between runs. The idea is that if it's being left in a clean state after every test, then it's in a clean state at the end of a test run, so you can just start again with that. For pytest-django, that's --reuse-db.
Brian: Okay. So for CI and stuff, can I save that? Can I give it a path or something to figure out, like, to save it somewhere?
Adam: It comes with caveats, so I think you would want to avoid doing that on CI. You want to make sure you're testing with a perfectly migrated database. And the big issue there is if you're switching between branches of your code. You're going to leave the database in a state where maybe it has a new field that you're working on in feature A, but now you're going to feature B, where the tests do not add that field, so they will just crash.
Brian: So we want to build into our process the ability to clean that out, or run without it, if we're changing the database structure.
Adam: Yeah, exactly.
Brian: Okay, that makes sense. Cool. Anything else?
Adam: Yeah, I touched on using Django's test case classes there. I prefer that as opposed to the other option, which is to just write pytest functions for your Django tests. And I think it's for two main reasons. One is that writing classes, I think, is still great for organizing tests. So a bunch of tests related to one Django view can go under one view test class, and then you know that they're related. The other one is that some of the features from Django's test case classes aren't available in pytest-django, the key one being setUpTestData, which lets you set up test data in an outer transaction that covers all the tests in one class. So you save some time: rather than resetting the data for every single test, you set it up once, and then the inner transaction for each test rolls back to that clean state.
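The outer and inner transaction layers Adam describes can be sketched with sqlite3 savepoints. Again, this is a standalone illustration under simplifying assumptions, not Django's implementation; run_test_class, the authors table, and the two "tests" are hypothetical names.

```python
import sqlite3


def run_test_class(conn, set_up_test_data, tests):
    """Run tests inside one outer transaction, with a savepoint per test."""
    snapshots = []
    conn.execute("BEGIN")  # outer layer: shared data for the whole class
    try:
        set_up_test_data(conn)  # runs once, not once per test
        for test in tests:
            conn.execute("SAVEPOINT per_test")  # inner layer: one per test
            try:
                snapshots.append(test(conn))
            finally:
                # Undo only what this test wrote; the shared data survives.
                conn.execute("ROLLBACK TO per_test")
                conn.execute("RELEASE per_test")
    finally:
        conn.execute("ROLLBACK")  # class finished: discard the shared data too
    return snapshots


def author_names(conn):
    rows = conn.execute("SELECT name FROM authors ORDER BY name")
    return [row[0] for row in rows]


def set_up_test_data(conn):
    conn.execute("INSERT INTO authors VALUES ('ada')")


def adds_author(conn):
    conn.execute("INSERT INTO authors VALUES ('brian')")
    return author_names(conn)


def sees_only_shared_data(conn):
    return author_names(conn)


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE authors (name TEXT)")
snapshots = run_test_class(
    conn, set_up_test_data, [adds_author, sees_only_shared_data]
)
print(snapshots)           # [['ada', 'brian'], ['ada']]
print(author_names(conn))  # []
```

The second test sees only the shared row because the first test's insert was rolled back to the savepoint, while the shared data was inserted just once.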
Brian: Well, that's cool. So if I'm using the test case classes, it's mostly using pytest-django as a test runner then, right?
Adam: Yeah, that works.
Brian: But I mean, you can also use some of the extra features, like marks and whatnot?
Adam: Yeah, definitely.
Brian: Does parameterization make sense for tests for Django?
Adam: It depends on what level we're talking about. When we're talking about a whole request-response cycle, I don't think it's very useful, because generally you want to test different scenarios there. You don't want to test some simple input-output kind of thing. But when we get closer to just the database layer or other parts of Django, like the template engine, then you do get a more functional style. So given this input, get that output, and then parameterization does make sense.
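A minimal sketch of the "given this input, get that output" style with pytest.mark.parametrize follows. To keep it runnable without a Django project, the standard library's string.Template stands in for a template engine; render_greeting is a hypothetical helper, not a Django function.

```python
import string

import pytest


def render_greeting(name):
    # Hypothetical stand-in for rendering a small template.
    return string.Template("Hello, $name!").substitute(name=name)


@pytest.mark.parametrize(
    ("name", "expected"),
    [
        ("Ada", "Hello, Ada!"),
        ("Brian", "Hello, Brian!"),
        ("", "Hello, !"),
    ],
)
def test_render_greeting(name, expected):
    # pytest runs this function once per (name, expected) pair.
    assert render_greeting(name) == expected
```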
Brian: Well, okay, so if you're using the test case class from unittest, what is it about pytest that we're using? Why are we using pytest at all?
Adam: It is a way superior runner. It lets you write plain assert statements. It's got much better debugging tools, like --pdb.
Brian: Yeah, and then all the other extensions. We can extend it to write different data output and all sorts of things like that. All right, well, cool. That's pretty exciting. I definitely think it's neat. And I also think it's pretty cool that it's being used so much. When I first started thinking about testing web stuff, I was a little disheartened, actually, that Django came with its own unittest extension. I'm like, well, I'm not going to be able to convince Django people to use pytest. But apparently they're already convinced. So that's pretty cool. Nice. Well, thanks, Adam. And we'll talk to you next time.
Adam: Thank you.