#Ask-IH I've been thinking about pricing, and I don't have a clue.
I'm building a service that helps a non-US publisher find the next book to translate. So it's a B2B market. I don't have any sales yet. However, I have some interest and an ongoing discussion with a potential client. I want to estimate what the price could be, both for the "MVP" and for a more or less final product.
This is the info I collected so far:
Should I anchor the price against the cost of an agency? Or should I, perhaps, base the price on the cost of an in-house scout?
My main goal right now is to validate the business model. What MVP price would serve this goal?
What are your thoughts? How would you approach this?
Unfortunately, the expert said that he needs more time. Meanwhile, I looked through the books myself, and the selection felt good enough to share.
I chose five genres relevant to the publisher I'm talking to. For every genre, I picked a book that the model rated highly, and I sent those books to the chief editor asking for a comment. Fingers crossed.
For those who are interested in the books, the list is below.
I added data about authors' history:
And other similar features
This, however, didn't improve any of the metrics.
I finally collected and added data about the known translations of a book. This both improves the modeling quality and will reduce the amount of manual work needed to filter books.
Tomorrow I will have a session with an expert, a former book scout. I hope to get some intuition on how manual selection is done. I will also validate the first automatically selected books. I will move to a demonstration phase right after.
I decided to publish a dataset that maps original books to their Russian translations. It covers the roughly 25k books I have found so far.
The matching is automatic, so it could contain some errors, but it looks quite clean.
I found an expert to help me validate the model results.
He is a former book scout, which means he has selected a lot of English books for translation. He agreed to do a two-hour session to discuss some example books, so he can give me some understanding of how he approaches this task. I believe this consultation could advance me a lot.
Additionally, I found a working solution for the dataset imbalance. The idea is to add originally unpopular books and label them as if they had no success after translation. This is ugly, yet it definitely leads to better results.
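To make the imbalance trick concrete, here is a minimal sketch of how it could look. The field names, the `max_ratings` threshold, and the sample books are all made up for illustration, and I am assuming the added books are ones with no translation outcome in the data; the real pipeline may differ.

```python
from dataclasses import dataclass


@dataclass
class Example:
    title: str
    original_ratings: int    # hypothetical popularity measure of the original edition
    translated_success: int  # 1 = the translation did well, 0 = it did not


def add_assumed_negatives(labeled, unlabeled, max_ratings=100):
    """Append originally unpopular books as assumed failures (label 0).

    `unlabeled` holds (title, original_ratings) pairs with no known
    translation outcome. The threshold is an arbitrary assumption.
    """
    negatives = [
        Example(title, ratings, 0)
        for title, ratings in unlabeled
        if ratings < max_ratings
    ]
    return labeled + negatives


# Made-up data: the labeled set is dominated by very popular books.
labeled = [
    Example("Hit A", 50_000, 1),
    Example("Hit B", 80_000, 1),
    Example("Flop C", 20_000, 0),
]
unlabeled = [("Obscure D", 12), ("Obscure E", 40), ("Popular F", 30_000)]

train = add_assumed_negatives(labeled, unlabeled)
positive_share = sum(e.translated_success for e in train) / len(train)
```

The obvious risk is that some of these assumed failures would have sold well if translated, so the labels are noisy by construction.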
Today I had a discussion with my first potential client. It was both quite hard and reassuring. It was hard because the person on the other side said: "OK, assuming you could deliver what we are discussing. So, what do you want?" I wasn't ready to actually discuss it. Then he said that they actually would pay for it if it were reasonably priced.
And right after that: "You know, this is an information service, so we would like to have exclusivity on it". At this point, I was completely stuck.
Does it mean that I need to do something exclusively for them? What could I possibly answer?
Luckily, it was not a call, but a WhatsApp chat. It was natural to take some time to think about it.
After this, I took part in the IH global meetup, where I actually asked my specific question: what could I ask? And I got some ideas.
It suddenly unstuck me. I actually started to think, "What would be good for the client?" I quickly came to the conclusion that giving information exclusively is actually a good idea. But it could be exclusive by genre, or even exclusive at the book level. It doesn't necessarily imply working with only one publisher.
It was a huge relief, thanks for this, IH community.
Finally, I've come to a point where I have some data to test on. The results, however, look controversial.
The issue is probably that the training set contains mostly very popular books.
Now it's not clear whether the model is working at least OK, and it's hard to judge by myself. It's also not clear whether I really want to present something I'm not sure of.
Thinking a lot.
After some scraping, I found that Amazon starts responding with a Robot Check page after about 1000 queries.
I don't really want to get into it right now. Luckily, I have an alternative.
I set up proper time-based validation: I now build the model on historical data from before 2018 and validate it on data from 2018 and 2019.
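For anyone curious, the split itself is simple. Below is a minimal sketch; the `year` field and the sample records are assumptions for illustration, not my actual data schema.

```python
def time_split(books, first_validation_year=2018, last_validation_year=2019):
    """Split records by year so the model never trains on the validation period."""
    train = [b for b in books if b["year"] < first_validation_year]
    valid = [
        b for b in books
        if first_validation_year <= b["year"] <= last_validation_year
    ]
    return train, valid


# Made-up records; "year" is a hypothetical field name.
books = [
    {"title": "A", "year": 2015},
    {"title": "B", "year": 2017},
    {"title": "C", "year": 2018},
    {"title": "D", "year": 2019},
    {"title": "E", "year": 2020},  # outside both periods, so it is dropped
]

train_books, valid_books = time_split(books)
```

Compared with a random split, this keeps information from the validation years out of training, which is closer to how the model would be used on new books.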
For more than an hour I believed that, according to the new validation, the model had no predictive power. And then I found a bug. Phew!
Currently, the best feature for the model is the number of editions. No surprise. This feature is obviously correlated with the number of countries a book was published in.
Another interesting thing is that a book's average rating has little correlation with the popularity of its translation.
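One way to check such a relationship is a rank correlation between a feature and the target. The sketch below uses Spearman correlation written in plain Python; this is only an illustration with made-up numbers, not necessarily the measure I used.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    std_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (std_x * std_y)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up numbers, only to show the call.
editions = [3, 10, 25, 40]
translation_popularity = [100, 400, 900, 5000]
rho = spearman(editions, translation_popularity)
```

A value near 0 for a feature such as the average rating would match the observation above, while a value near 1 or -1 indicates a strong monotonic relationship.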
Now it's close to ready for an alpha test.
I want to make book publishing more effective by providing technical and analytical tools.