Find a next book to translate

No Employees
Founders Code
Solo Founder

I want to make book publishing effective by providing technical and analytical tools


#Ask-IH I've been thinking about pricing, and I don't have a clue.

I'm building a service that helps a non-US publisher find the next book to translate, so it's a B2B market. I don't have any sales yet, but I have some interest and an ongoing discussion with a potential client. I want to estimate what the price could be, both for the "MVP" and for a more or less final product.

This is the info I collected so far:

  1. There is a market for such a service: literary scouting agencies already provide it.
  2. Agencies work under exclusive contracts: one client per country per book type (children's, fiction, non-fiction). That could be around 10-30 clients in total for an agency.
  3. I was told that in the UK market it costs an average publisher around $10-20k per year.
  4. In Russia, an in-house scout costs around $15-20k per year. This is publicly known about one publisher, which has a team of scouts working on the task.

Should I anchor the price to the cost of an agency? Or should I, perhaps, base the price on the cost of an in-house scout?

My main goal right now is to validate the business model. What MVP price would serve this goal?

What are your thoughts? How would you approach this?

Demo day

Unfortunately, the expert said that he needs more time. Meanwhile, I looked through the books myself, and the selection felt good enough to share.

I chose five genres relevant to the publisher I'm in contact with. For each genre, I picked a book that the model rated highly, and I sent those books to the chief editor asking for comments. Fingers crossed.

For those who are interested in the books, the list is below.






Authors data

I added data about each author's history:

  • How experienced is the author?
  • Did the author have successful books before?

And other similar features.

This, however, didn't improve any metric.

Using data about translations to other languages

I finally collected and added data about the known translations of each book. This both improves model quality and will reduce the amount of manual work needed to filter books.

Tomorrow I will have a session with an expert, a former book scout. I hope to get some intuition for how manual selection is done. I will also validate the first automatically selected books. Right after that, I will move to a demonstration phase.

A 25k+ dataset of translations

I decided to publish a dataset that matches original books to their Russian translations. It covers about 25k books I've found so far.

The matching is automatic, so it could contain some errors, but it looks quite clean.

I found an expert

I found an expert to help me validate the model results.

He is a former book scout, which means he has selected a lot of English books for translation. He agreed to a two-hour session to discuss some example books, so he can give me some understanding of how he approaches this task. I believe this consultation could move me forward a lot.

Additionally, I found a working solution for the dataset imbalance. The idea is to add books that were unpopular in the original language, labeled as if they had no success after translation. This is ugly, yet it definitely leads to better results.
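The imbalance workaround above can be sketched roughly as follows. This is only an illustration, not the actual pipeline: the field names (`original_popularity`, `success`, `assumed_label`), the threshold, and the sample records are all hypothetical.

```python
from typing import Any, Dict, List

Book = Dict[str, Any]


def add_assumed_negatives(
    labeled: List[Book],
    untranslated: List[Book],
    popularity_threshold: float,
) -> List[Book]:
    """Return the labeled books plus unpopular untranslated books marked as failures.

    Books below the (hypothetical) popularity threshold get success=0, as if
    they had been translated and flopped. The assumed_label flag keeps these
    synthetic rows easy to exclude from validation later.
    """
    assumed = [
        {**book, "success": 0, "assumed_label": True}
        for book in untranslated
        if book["original_popularity"] < popularity_threshold
    ]
    return labeled + assumed


# Made-up records, for illustration only.
labeled = [
    {"title": "A", "original_popularity": 9000, "success": 1},
    {"title": "B", "original_popularity": 7000, "success": 0},
]
untranslated = [
    {"title": "C", "original_popularity": 40},
    {"title": "D", "original_popularity": 8000},
]

train = add_assumed_negatives(labeled, untranslated, popularity_threshold=100)
print([(book["title"], book["success"]) for book in train])
```

Keeping a flag on the synthetic rows is a design choice in this sketch: it lets the assumed negatives help training without leaking into the metrics.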

Additional validation

Today I had a discussion with my first potential client. It was both quite hard and reassuring. It was hard because the person on the other side said: "OK, assuming you could deliver what we are discussing. So, what do you want?" I wasn't ready to actually discuss it. Then he said that they would actually pay for it if it were reasonably priced.

And right after that: "You know, this is an information service, so we would like to have exclusivity on it." At this point, I was completely stuck.

Does it mean that I need to do something exclusively for them? What could I possibly answer?

Luckily, it was not a call, but a WhatsApp chat. It was natural to take some time to think about it.

After this, I took part in the IH global meetup and asked my specific question: what could I answer? I got some ideas:

  • Maybe you can offer some part to the whole market and develop some customization for them?
  • Maybe you don't need them as a client.
  • Maybe it's OK to do it exclusively, and you need to charge more money for this service?

This suddenly got me unstuck. I started to think: "What would be good for the client?" I quickly came to the conclusion that giving information exclusively is actually a good idea. But it could be exclusive by genre, or even exclusive at the book level. It doesn't necessarily imply working with only one publisher.

It was a huge relief. Thanks for this, IH community.

An alpha test

Finally, I've come to a point where I have some data to test on. The results, however, look ambiguous.

The issue is probably that the training set contains mostly very popular books.

Now it's not clear whether the model works even reasonably well, and it's hard to judge by myself. I don't know if I really want to present something I'm not sure of.

Thinking a lot.

Amazon definitely doesn't want to give away their data

After some scraping, I found that Amazon starts to respond with a Robot Check page after about 1,000 queries.

I don't really want to get into it right now. Luckily, I have an alternative.

More work on the model

I set up a proper time-based validation: I now build the model on historical data from before 2018 and validate it on data from 2018 and 2019.
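A minimal sketch of such a time-based split, assuming each book record carries a `year` field (hypothetical name, as are the sample records), could look like this:

```python
from typing import Any, Dict, Iterable, List, Tuple

Book = Dict[str, Any]


def time_based_split(
    books: Iterable[Book],
    train_before: int = 2018,
    valid_years: Tuple[int, ...] = (2018, 2019),
) -> Tuple[List[Book], List[Book]]:
    """Split books by year so the model never trains on the validation period."""
    train: List[Book] = []
    valid: List[Book] = []
    for book in books:
        if book["year"] < train_before:
            train.append(book)
        elif book["year"] in valid_years:
            valid.append(book)
        # Books from any other year are left out of both sets.
    return train, valid


# Made-up records, for illustration only.
books = [
    {"title": "A", "year": 2015},
    {"title": "B", "year": 2017},
    {"title": "C", "year": 2018},
    {"title": "D", "year": 2019},
    {"title": "E", "year": 2020},
]

train_set, valid_set = time_based_split(books)
print(len(train_set), len(valid_set))
```

Unlike a random split, this keeps every validation book strictly later than every training book, which is closer to how the model would be used on newly published titles.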

For more than an hour I believed that, according to the new validation, the model had no predictive power. And then I found a bug. Phew!

Currently, the best feature for the model is the number of editions. No surprise: this feature is obviously correlated with the number of countries the book was published in.

Another interesting thing is that a book's average rating has little correlation with the popularity of its translation.
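One simple way to check a claim like this is a Pearson correlation between a feature and the target. The sketch below is an assumption about how such a check could be done, not the method actually used here, and the sample numbers are invented.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented numbers: a feature column and a target column of the same length.
editions = [1, 2, 3, 4, 5]
translation_popularity = [10, 20, 30, 40, 50]

print(pearson(editions, translation_popularity))
```

A value near zero for average rating against translation popularity would match the observation above, while a strong feature such as the number of editions would score closer to 1.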

Now it's almost ready for an alpha test.
